FactorSD

FactorSD@lemmy.dbzer0.com

It does seem to work fairly well, although it doesn’t fit my workflow at all, so I haven’t done a lot of testing. I do think there are some UI things you could look at, though. Engine and Dimensions shouldn’t be minimizable lists, because the fields only take up as much space as the label does. Also, your tooltips are outrageously large, covering about 75% of the width of a 1080p monitor, which makes them quite hard to actually read.

It’s hard to give precise figures, because there are always tricks for getting by with a little more or less, but from my (admittedly limited) testing SDXL is significantly more demanding, and 10+GB of VRAM is probably going to be the minimum to run it. I don’t remember exactly what I was doing, but I run on an RTX A4500 card and I managed to max out the 20GB of VRAM with just one SDXL process, where I can normally run a LORA training and generate 512x768 images at the same time.
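
For a rough sense of scale, here is a small sketch comparing the image sizes involved. It assumes SDXL's native 1024x1024 resolution and the usual 8x VAE downsampling to latent space. Pixel count is only one factor in VRAM use (the SDXL model itself is also much larger), so treat the ratio as an illustration and not as a memory estimate.

```python
# Assumption: latents are 1/8 of the image size on each axis (standard SD VAE).
VAE_DOWNSCALE = 8

def image_stats(width, height):
    """Return pixel count and latent grid size for an image resolution."""
    latent_w, latent_h = width // VAE_DOWNSCALE, height // VAE_DOWNSCALE
    return {
        "pixels": width * height,
        "latent_shape": (latent_w, latent_h),
        "latent_cells": latent_w * latent_h,
    }

sd15 = image_stats(512, 768)     # the 512x768 size mentioned above
sdxl = image_stats(1024, 1024)   # assumed SDXL native resolution

ratio = sdxl["pixels"] / sd15["pixels"]
print(f"SD 1.5 latent grid: {sd15['latent_shape']}")
print(f"SDXL latent grid:   {sdxl['latent_shape']}")
print(f"SDXL works on {ratio:.2f}x as many pixels per image")
```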

Protip - If an image is good but not quite perfect, stick to the same seed and use the X/Y script to run the image lots of times at different CFG levels.
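
If you would rather script it than use the X/Y script, the same idea can be sketched as a list of settings that differ only in CFG. This is a minimal sketch, not the X/Y script itself: the prompt, seed, and step count are placeholders, and the key names (`seed`, `cfg_scale`, `steps`) are assumed to match whatever your UI or API expects.

```python
def cfg_sweep(base_settings, cfg_values):
    """Return one settings dict per CFG value, all sharing the base seed."""
    return [{**base_settings, "cfg_scale": cfg} for cfg in cfg_values]

# Placeholder settings for the image that was "good but not quite perfect".
base = {
    "prompt": "portrait photo of a lighthouse keeper",
    "seed": 1234567890,  # fixed, so only the CFG changes between runs
    "steps": 30,
}

runs = cfg_sweep(base, [3, 4.5, 6, 7.5, 9])

for run in runs:
    print(f"seed={run['seed']} cfg_scale={run['cfg_scale']}")
```

Each dict can then be handed to whatever generation call you use, and the results compared side by side.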

A lot of the time I try to just let images come out as the AI imagines them - Just running img2img prompts, often in big batches, then picking the pictures that best reflect what I wanted.

But I do also have another process for when I want something specific. I use img2img to generate a pose and general composition, turn that image into both a ControlNet input (for composition) and a Segment Anything mask (for Latent Couple), and then respin the same image with the same seed under those new constraints. When you run with the ControlNet and the mask you can turn the CFG way down (3 or 4) but keep the coherence in the image, so you get much more naturalistic outputs.

This is also a good way to work with LORAs that are either poorly made or don’t work well together - The initial output might look really burned, but when you have the composition locked in you can run the LORAs at much lower strength and with lower CFG so they sit together better.
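
As a rough sketch of the second pass, the settings change might look like the following. It assumes LORAs are written in the prompt with the `<lora:name:weight>` syntax used by Automatic1111; the LORA names, the 0.5 scaling factor, and the settings keys are made up for illustration. The ControlNet and mask inputs are left out, since how they are attached depends on the extension.

```python
import re

# Matches prompt tags of the form <lora:name:weight>.
LORA_TAG = re.compile(r"<lora:([^:>]+):([0-9]*\.?[0-9]+)>")

def scale_lora_weights(prompt, factor):
    """Multiply every LORA weight in the prompt by `factor`."""
    def _scaled(match):
        name, weight = match.group(1), float(match.group(2))
        return f"<lora:{name}:{weight * factor:.2f}>"
    return LORA_TAG.sub(_scaled, prompt)

def second_pass(first_pass, cfg_scale=3.5, lora_factor=0.5):
    """Same seed and prompt text, but lower CFG and weaker LORAs."""
    return {
        **first_pass,
        "prompt": scale_lora_weights(first_pass["prompt"], lora_factor),
        "cfg_scale": cfg_scale,
    }

# Hypothetical first pass that came out "burned".
first = {
    "prompt": "a knight <lora:styleA:0.8> <lora:charB:1.0>",
    "seed": 42,
    "cfg_scale": 7.5,
}

respin = second_pass(first)
print(respin["prompt"])
print(respin["seed"], respin["cfg_scale"])
```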

The real value of SDXL isn’t the higher native resolution, it’s the improvements in rendering fingers and text and so on. But honestly I have not yet been super impressed by SDXL. It’s like wanting to keep playing the old game with all its DLC and mods. SDXL is good, but until we have the same depth of resources available I am staying with 1.5.

The community will decide what is best by which model they support

I am planning on cooking a LORA today - I’ll give this a go and report back.

I guess YMMV on whether focused is boring or not. I agree that I never really found stimulants to be super interesting, but that’s partly because it was too expensive to do coke just to work on whatever project was on my mind.

Most SD stuff requires specific versions of everything, and as you say the documentation is poor even on Windows. Try other forks, and you may get lucky.

How is it meaningfully different to the existing Scribble and Lineart controlnets that are already working in Automatic1111?
