I have been struggling to get a decent image from SD in fewer iterations.
I have played around with different sampling methods, CFG values, and step counts, but I have been unable to find a consistent configuration that gives me decent images.
Simple prompts that I am struggling with:
a photo of a puppy, intricately detailed, realistic
drawing of a bowl of fruits, manga style
If I am unable to get good output for simple prompts, I am afraid the output for more complex or abstract prompts will be completely unusable.
Are there any tricks that can reduce the iterations to give decent images? Any guidance would be really appreciated.
Thanks!
The really dynamite images you see on social media are generally not generated with the default models.
If you’ve messed with the iterations and CFG and still aren’t getting anything close to what you want, I’d try tweaking the prompts, and I’d definitely try the same prompt across multiple models.
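A rough sketch of how that comparison could be organized so the results are actually comparable. It only builds the list of settings to try; it does not call any SD tool. The model names, CFG values, step counts, and seed below are placeholders for illustration, not recommendations.

```python
from itertools import product

def build_sweep(prompt, models, cfg_values, step_counts, seed=1234):
    """Return one settings dict per (model, cfg, steps) combination.

    The seed is held fixed so that differences between outputs come from
    the settings rather than from a different starting noise.
    """
    return [
        {"prompt": prompt, "model": m, "cfg_scale": c, "steps": n, "seed": seed}
        for m, c, n in product(models, cfg_values, step_counts)
    ]

sweep = build_sweep(
    "a photo of a puppy, intricately detailed, realistic",
    models=["default-model", "some-photoreal-model"],  # hypothetical names
    cfg_values=[5, 7, 9],
    step_counts=[20, 30],
)
print(len(sweep))  # 2 models * 3 CFG values * 2 step counts = 12
```

Each dict can then be fed to whatever front end you use, and the outputs laid side by side.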
Okay. I did not play with anything beyond the default models. Any suggestions for non-default models?
As for tweaking prompts, yes, I am already doing it. But I still made the post to be sure I am not missing anything.
Thanks for your answer and suggestions. :-)
I’d check out https://civitai.com/ and see if any seem to align with what your intentions are.
Basically, people will augment the training of existing models through various means.
One of your prompts said something about anime or manga and I am aware that there are many models trained specifically to be good at that.
Wow! This might help me a lot. I will spend some time trying out a few models. Thanks again for the resource. :-)
I’ve been curious if you’ve had any better luck with a different model?
I actually was able to generate much better images and learn about LoRA, hypernetworks, etc. thanks to the website you shared. :-)
Having said that, Stable Diffusion is a bit too cumbersome and tedious compared to Midjourney. But it is FOSS, and it is easier to get started with thanks to the tools from Automatic1111.
Looking forward to SDXL, hopefully it alleviates some of the pain points.
I haven’t tried Midjourney so I don’t really have a frame of reference. After some practice and getting a good flow going, I was surprised how quickly I could get things to a point that I liked them. Does Midjourney do inpainting? I’ve been using SD for creating game assets. I’m a trash artist but if I scribble a shit version and img2img with a batch size of like 20… I almost always get something REALLY close to what I need. My use cases are super dependent on inpainting so it’s kinda a must-have.
There are a few other tools that leverage SD. If you’re curious, look into Retro Diffusion for Aseprite, super neat workflow there.
I don’t think MJ does inpainting yet, or at least in an accessible way like SD.
I haven’t used Aseprite but Retro Diffusion looks really cool and useful.
I was initially trying to generate retro/pixel art with the help of prompts, but it was mostly hit or miss. I then found a few webui extensions, like sd-webui-pixelart, that got me closer to the goal.
OK well if that’s what you’re looking for, I can at least tell you about what I had luck with:
For backgrounds, I would usually start with a prompt and I would generate like 30 or 40 in a batch. Then I skim them to see if any are kinda in the zone. Sometimes you can have a good prompt but just not a great seed, so blasting a big pile of them out per prompt is a way to really establish how in-line your prompt is.
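If you would rather script that big batch than click through the UI, here is a minimal sketch. It assumes the Automatic1111 web UI is running locally with its API enabled (the `--api` launch flag) at the default address. The field names follow that API's txt2img endpoint as I understand it, so check them against your installed version. The `send` helper is defined but not called here.

```python
import json
import math
import urllib.request

def build_txt2img_payload(prompt, total_images=40, batch_size=8, steps=20, cfg_scale=7):
    """Build a request body asking for roughly total_images random-seed images."""
    return {
        "prompt": prompt,
        "seed": -1,  # -1 asks the web UI to pick a random seed
        "steps": steps,
        "cfg_scale": cfg_scale,
        "batch_size": batch_size,  # images generated in parallel (limited by VRAM)
        "n_iter": math.ceil(total_images / batch_size),  # number of sequential batches
    }

def send(payload, base_url="http://127.0.0.1:7860"):
    """POST the payload to a locally running web UI; returns base64-encoded images."""
    req = urllib.request.Request(
        f"{base_url}/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["images"]

payload = build_txt2img_payload("drawing of a bowl of fruits, manga style")
print(payload["batch_size"] * payload["n_iter"])  # 8 * 5 = 40 images requested
```

The split between `batch_size` and `n_iter` is just a way to reach 30 or 40 images without running out of VRAM; the numbers are illustrative.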
Then, if I find one, or some, that look along the lines of what I’m looking for, I usually want to make some more direct changes and get a lot more hands-on.
I fire up my image editor (Gimp) and I do like the SHITTIEST hack job (not an artist) of drawing in how I want it to be different. Laughably bad drawings. Barely better than stick men. Think “if I squinted hard I could maybe imagine this blob to be what I want.”
Then I take that massacred image back to img2img for inpainting. Mask the parts where I want it to try again. Again, I’ll order up like 20 in a batch. Find the one that most closely aligns with what’s in my head, and then maybe iterate off of that version.
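The same step can be scripted. This is only a sketch under the same assumption as before: the Automatic1111 web UI API, whose img2img endpoint (`/sdapi/v1/img2img`) takes the edited image and the mask as base64 strings. The prompt, the denoising strength, and the byte strings are stand-ins; in practice you would read the PNG files you exported from your editor.

```python
import base64

def build_inpaint_payload(prompt, image_bytes, mask_bytes,
                          batch_size=20, denoising_strength=0.75):
    """Build an img2img request body that repaints only the masked region."""
    def b64(data):
        return base64.b64encode(data).decode("ascii")

    return {
        "prompt": prompt,
        "init_images": [b64(image_bytes)],  # the rough, hand-edited image
        "mask": b64(mask_bytes),            # marks the area to regenerate
        "denoising_strength": denoising_strength,  # higher = stray further from the scribble
        "batch_size": batch_size,           # lower this if 20 at once exceeds your VRAM
    }

# Placeholder bytes; replace with open("edited.png", "rb").read() and so on.
payload = build_inpaint_payload(
    "stone bridge over a river",  # hypothetical prompt
    image_bytes=b"<png bytes of the edited image>",
    mask_bytes=b"<png bytes of the mask>",
)
```

If the results ignore your scribble, lowering `denoising_strength` is the first knob I would try; if they copy it too literally, raise it.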
IMHO obsessing over prompts is overrated. Broad strokes and inpainting… Taking kind of a “genetic algorithm” approach to zeroing in on what you actually want is a far superior workflow.
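To make the “genetic algorithm” idea concrete, here is a toy sketch of the loop: generate a batch of variants, keep the best one, repeat. Strictly speaking it is a keep-the-best selection loop rather than a full genetic algorithm (there is no crossover). The "images" are just numbers and the "picking" is automatic so the example can run on its own; in the real workflow `make_variants` would be an img2img or inpainting batch and `pick_best` would be you eyeballing the results. All names here are made up for illustration.

```python
import random

def refine(start, make_variants, pick_best, rounds=5, batch_size=20):
    """Repeatedly generate variants of the current pick and keep the best one."""
    current = start
    for _ in range(rounds):
        # Keep the current pick in the pool so a bad batch never makes things worse.
        candidates = [current] + make_variants(current, batch_size)
        current = pick_best(candidates)
    return current

# Toy stand-ins: a candidate is a number, and "what's in my head" is 100.
rng = random.Random(0)
target = 100.0

def make_variants(parent, n):
    return [parent + rng.uniform(-10, 10) for _ in range(n)]

def pick_best(candidates):
    return min(candidates, key=lambda c: abs(c - target))

result = refine(0.0, make_variants, pick_best)
print(abs(result - target) <= abs(0.0 - target))  # True: never further from the target than the start
```

The batch size plays the same role as ordering 20 at a time: more candidates per round means a better chance that one of them moves in the right direction.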