I used Stable Diffusion to create pictures of… things.
I'd even go so far as to bet it creates illustrations of whatchamacallits.
I’ve been using Stable Diffusion (via Automatic1111) for a long time and have become fairly adept at it. Recently Bing’s DALL-E 3 has surpassed it in composition and instruction-following, but I still find Stable Diffusion really important for doing “finishing” work on DALL-E 3’s outputs, so I don’t expect to stop using it any time soon.
Lately I’ve been experimenting with KoboldCpp and locally run large language models, and I’ve been coming up with little ideas for scripts and programs that use its API.
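For example, a minimal script along those lines might look like this (assuming KoboldCpp’s default local port of 5001 and its KoboldAI-style generate endpoint; the prompt is just a placeholder):

```python
import requests

# Assumes KoboldCpp is running locally on its default port (5001) with a
# model loaded, exposing its KoboldAI-compatible generate endpoint.
API_URL = "http://localhost:5001/api/v1/generate"

payload = {
    "prompt": "Write a one-sentence description of a naga for a D&D campaign:",
    "max_length": 80,     # number of tokens to generate
    "temperature": 0.7,   # sampling randomness
}

response = requests.post(API_URL, json=payload, timeout=120)
response.raise_for_status()

# The response holds a list of results; print the generated text of the first.
print(response.json()["results"][0]["text"])
```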
You can use Stable Diffusion to alter existing images? I somehow never realized that. What UI do you use?
He mentioned he uses Automatic1111.
The Stable Diffusion mode for working with existing images is called img2img.
Yup. It has a couple of different ways of doing img2img work. The most basic img2img just uses an existing image as a “starting point” and creates whole new images based on it.

You can also do targeted “inpainting”, which lets you paint a mask onto the image; it then regenerates only that bit, trying to keep it blended seamlessly into the unchanged parts of the image around it.

And then there’s ControlNet, an additional layer of processing that takes an input image and analyzes it, trying to create outputs that match what it “understands” to be there rather than just the visual appearance of the source image. So for example you could take a photo of someone in a particular pose and then generate new images of completely different characters who are in that same pose.
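If you ever want to drive that from a script instead of the web UI, Automatic1111 can be launched with the --api flag, which exposes an img2img endpoint over HTTP. Here’s a rough sketch of a basic img2img call (the field names are from memory, so treat the exact parameters as assumptions, and the filenames are made up):

```python
import base64
import requests

# Assumes the Automatic1111 webui was started with --api on its default port (7860).
API_URL = "http://localhost:7860/sdapi/v1/img2img"

# Load the source image and base64-encode it, as the API expects.
with open("source_photo.png", "rb") as f:
    init_image = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "init_images": [init_image],
    "prompt": "a naga with sheep horns, fantasy illustration",
    "denoising_strength": 0.55,  # lower stays closer to the source, higher diverges
    "steps": 30,
}

response = requests.post(API_URL, json=payload, timeout=300)
response.raise_for_status()

# Generated images come back as base64 strings; save the first one.
with open("output.png", "wb") as f:
    f.write(base64.b64decode(response.json()["images"][0]))
```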
Automatic1111 takes some technical fiddling to get set up, and you’ll need to download models that match your needs (Civitai is a good source), but once it’s running it’s really fun to play around with. A few days back I made this image of a naga for a D&D campaign by crudely splicing together photos of two different snakes, a woman’s face, and some sheep horns in GIMP, then doing repeated inpainting passes to clean everything up and get each bit exactly right. It took hours, but it’s the best example yet of me picturing something in my mind and then generating an image that matches it almost exactly. I’m rather proud of it.
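For the inpainting passes specifically, that same API endpoint takes a mask image: by default, white areas get regenerated and black areas are left alone. Another sketch, with made-up filenames and the same caveat that the parameter names are from memory:

```python
import base64
import requests

API_URL = "http://localhost:7860/sdapi/v1/img2img"

def b64(path):
    # Read an image file and return it base64-encoded, as the API expects.
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

payload = {
    "init_images": [b64("work_in_progress.png")],
    "mask": b64("face_mask.png"),  # white pixels are regenerated, black pixels kept
    "prompt": "a woman's face, detailed fantasy illustration",
    "denoising_strength": 0.4,     # keep the repainted region close to what's there
    "steps": 30,
}

response = requests.post(API_URL, json=payload, timeout=300)
response.raise_for_status()

with open("next_pass.png", "wb") as f:
    f.write(base64.b64decode(response.json()["images"][0]))
```

Each cleanup pass is just another call like this with a new mask, feeding the previous output back in as the next input.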
I once used Craiyon.com to generate an image of an NPC for an online D&D game I was DM’ing. (And if you zoomed in too far, you could see it was a little fucked up.) Aside from that, none.
Just ChatGPT so far.
I did have DALL-E paint me a picture of “a mouse jumping a motorcycle through a flaming ring made of stone while pursued by vaguely ninja-like evil henchmen characters”
It ended up being this: https://i.imgur.io/ArDk1e1_d.webp?maxwidth=640&shape=thumb&fidelity=medium
Which makes me really, really want this as a video game. Just riding the motorcycle through various environments with ninjas popping out left and right trying to grab you. Sometimes they’ve got nunchucks, sometimes nets, sometimes they swing down on a rope to get you. You get power-ups too, like little bombs you can throw.
But that’s the only time I used the image generation. Mostly I’ve been having GPT-4 explain history and technology to me.
I’ve been trying Stable Diffusion, but even with downloaded models, nothing I make looks even CLOSE to the quality of Bing Image Creator with the same prompts. I don’t know what I’m doing wrong.