God bless the internet.
Hey, when I was a kid in the 90s if you wanted to see a picture of a raccoon you’d spend all day online waiting for it to load! If you knew the right guy at school though you could get a 1/4" floppy disk already loaded with raccoon pictures so you wouldn’t have to wait so long…
5.25", or 3.5"? We had both at my parents’ house in the 90s, but the standard was 3.5"
I think there may even be an external 8" floppy drive in my parents’ barn.
This would have been around 95 or 96 so definitely 3.5" floppies. The same guy said he could get me bootleg copies of Duke Nukem 3D and Rise of the Triad and they both came on huge stacks of 3.5" floppies that he expected me to give back to him lol
If you wanted to watch a raccoon try to wash cotton candy, you’d have to find a raccoon and give it some cotton candy.
In case anyone wants to see this:
https://i.imgur.com/Jj4WK1a.gif
Now you can just describe a specific raccoon video and be presented with it by a stranger who also loves this video.
Now we can generate one in a few seconds that has never existed, just by touching our phone in the right places a few times
TeChNicAlLy it sort of already existed. A diffusion model can only generate an impossibly huge but finite number of images, and the content of these images is determined when the model is created. So when you ‘generate’ images you’re kinda just browsing a catalog
That’s not correct. It allows permutations of concepts it has “identified”. It’s only finite in the sense of being limited by the number of pixels and possible color values.
This is far from what “browsing a catalog” makes it sound like. It’s almost correct if you consider the catalogue to be a collection of concepts. But it is literally generating an image from a prompt that projects those concepts into an image. You can generate something with a combination of concepts, mixed together in ways that were never part of any training set
A raccoon with purple fur, wearing sunglasses and a metal viking helmet with ivory horns, the blue planet earth is visible in the starry background
It doesn’t do a perfect job, but I also spent 2 minutes on this.
You missed my point I think. There is a finite number of possible prompts and settings resulting in a finite number of possible images. I wasn’t talking about training sets at all
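For scale, “finite” is doing a lot of work in that argument. A back-of-envelope sketch (assuming a hypothetical 512×512 RGB output at 8 bits per channel, not any particular model) shows just how large the finite output space is:

```python
import math

# Back-of-envelope sketch (hypothetical image size, not any specific model):
# even ignoring prompts and settings, the raw output space is finite.
# A 512x512 image with 3 color channels at 8 bits each has
# 256 ** (512 * 512 * 3) possible pixel configurations.
channel_values = 512 * 512 * 3  # individual 8-bit channel samples per image
# Count decimal digits via log10 instead of building the huge integer.
digits = int(channel_values * math.log10(256)) + 1
print(f"256**{channel_values} has about {digits:,} decimal digits")
```

That number has roughly 1.9 million decimal digits, so while the space is technically finite, it’s far beyond anything a “catalog” metaphor suggests.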
Besides the point the other commenter already made, I’d like to add that inference isn’t deterministic even for a fixed model. There are a bunch of sources of inconsistency:
- GPU hardware/software can influence the results of floating point operations
- Different inference implementations can change the order of operations (and floating-point addition isn’t associative, so reordering changes the results)
- Different RNG implementations can change the space of possible seed images
If you generate with the same prompt and settings you get what I’d consider the same image, with only tiny variations (they don’t match pixel-perfect)
Edit: A piece of paper has a random 3D relief of fibers, so the exact position a printer ink droplet ends up at is also not deterministic, and so no two copies of a physical catalog are identical. But we would still consider them the “same” catalog
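To make the floating-point point above concrete, a minimal sketch in plain Python (no GPU needed): adding the same values in a different order can produce a different result, because float addition is not associative.

```python
# Floating-point addition is not associative: summing the same values
# in a different order changes the result. This is one reason two
# inference implementations can disagree in the low bits of every
# activation, even with identical weights, prompt, and seed.
a = (1e16 + 1.0) + -1e16  # the 1.0 is lost when added to the huge value first
b = (1e16 + -1e16) + 1.0  # the huge values cancel first, so the 1.0 survives
print(a)  # 0.0
print(b)  # 1.0
```

Multiply that last-bit disagreement across the millions of accumulations in each matrix multiply and you get exactly the “same image except for tiny variations” behavior described in the comment.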