• @fidodo
    link
    English
    2
    edit-2
    10 months ago

    I think the main point here is that image generation AI doesn’t understand language, it’s giving weight to pixels based on tags, and yes you can give negative weights too. It’s more evident if you ask it to do anything positional or logical, it’s not designed to understand that.

    LLMs are though, so you could combine the tools so the LLM can command the image generator and even create a seed image to apply positional logic. I was surprised to find out that asking chat gpt to generate a room without elephants via dalle also failed. I would expect it to convert the user query to tags and not just feed it in raw.