Have you noticed? The “realistic” or “realism” AI results for men and women have noticeably big heads (often even huge) compared to their own bodies, as if the camera were very close to the face. Even the results in Google Search show this phenomenon, and it is most obvious with Asian women. Why does this happen, and why do different AIs all produce the same pattern?

Try searching for it now, with something like “realistic AI Asian woman”, and you’ll even see images with genuinely HUGE heads, most especially the “realistic” ones. Different AI programs, and yet the same huge-head pattern from all of them?

Any actual AI users, could you please share your experience too? Thanks in advance for any enlightenment.

  • @j4k3
    115 minutes ago

    Every model and fine-tune is different. With more advanced tools, you just need to negative out the sources. If you have ComfyUI, use the manager to get the Derfuu ComfyUI Modded Nodes package and use the debug node to see what is actually happening with sigmas, CLIP, and conditioning. It is all just numbers. Once you can break these out as numbers, you can start understanding the whole process much better.

    I like to find information about how people train and tune models using various sources and tag generation tools. This helps to isolate a lot of the obscure issues. Some really helpful negative tags are: ai generated, source unknown, cartoon, NPC, anime, render, practice, noise, almost, plastic, deep fake, Photoshop, overlay

    However, tags like these are more common in mixed media that include art. You can often get better results by describing the subject in plain English, without tags. All of the alignment issues are also real-world adjacent. You can often get better images if you negative out: creepy face, Guy Fawkes mask, clown, stone eyes, roll eyes, morality.

    In the positive, you really need to know what datasets and tags were used in training to make effective keyword terms. However, it also helps to explore a more scientific definition of the human body and what beauty really is. Terms like these can help: neoteny, symmetry, small nose mouth brow, big eyes, forward slant face, small joints, delicate, Tanner Stage 5 physical development

    You might also add “healthy weight” to the negative, as there is a baked-in bias. Some models also lock up some aspects of generation behind forbidden fruit, which can pull out some aspects in the positive.
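    To make the positive/negative advice above concrete, here is a minimal sketch of assembling prompt strings from terms like the ones listed. The specific term choices are just examples from this thread, and the diffusers pipeline call at the bottom is shown only as a comment, since the model name and setup are assumptions, not a recommendation:

```python
# Sketch: building positive and negative prompts from the term lists above.
# The terms themselves are examples pulled from this thread, not a recipe.

positive_terms = [
    "photo of a woman",   # plain-English subject description, no tags
    "symmetry",
    "small nose mouth brow",
    "big eyes",
    "delicate",
]

negative_terms = [
    "ai generated", "cartoon", "anime", "render",
    "plastic", "deep fake", "Photoshop", "overlay",
    "creepy face", "clown",
    "healthy weight",     # counteracts the baked-in weight bias noted above
]

prompt = ", ".join(positive_terms)
negative_prompt = ", ".join(negative_terms)

print(prompt)
print(negative_prompt)

# Illustrative use with Hugging Face diffusers (not run here; the model
# name is a placeholder):
# from diffusers import StableDiffusionPipeline
# pipe = StableDiffusionPipeline.from_pretrained("some/model")
# image = pipe(prompt=prompt, negative_prompt=negative_prompt).images[0]
```

    Most UIs (ComfyUI included) take the same two strings; only the wiring differs.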

    A lot of this gets into underage adjacent territory which can make some people uncomfortable, and for that I apologise. That isn’t my intent. The more scientific definition of beauty based on blind testing follows neoteny in humans which is the retention of some juvenile features such as big eyes and a small mouth and nose. These feature sets apply to both sexes. These juvenile features are often what is triggering alignment layers to obfuscate and create oddball outputs. Most uncensored models struggle with this too. Even something like Pony Realism or Flux models will steer the output towards a more rendered cartoon as the output begins to get flagged by the alignment bias layers. If you want to avoid this behavior in a more deterministic way, you really need to understand the scope of alignment bias.

    Alternatively, if you define the subject as a synthetic AI human and avoid any even slightly subtle hinting that the person is real or even ‘in the style of’ a real person, it will open up a lot of freedom in what is generated.

  • Rhynoplaz
    83 hours ago

    I imagine that since there are so many examples of characters with large heads throughout art, especially cartoons, it considers an oversized head an acceptable variable to play with. The more realistic the art form, the less variation you see, but there are still big-headed realistic drawings and paintings.

    • Natanael
      42 hours ago

      Also, so many head photos are taken from an angle right in front of the face or a bit higher, distorting the shoulders in ways you don’t notice but which matter when the algorithm tries to splice everything together into an averaged model.
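      That perspective effect can be quantified with a simple pinhole-camera model, where apparent size is proportional to real size divided by distance. All the distances and sizes below are made-up illustrative numbers:

```python
# Pinhole model: apparent (angular) size is proportional to
# real size / distance from the camera. Close-up portraits put the
# torso noticeably farther away than the face, shrinking it.

def apparent_size(real_size_cm, distance_cm):
    """Angular-size proxy under a pinhole model."""
    return real_size_cm / distance_cm

head, torso = 25.0, 60.0  # rough real sizes in cm (assumptions)

# Close-up shot: face at 40 cm, torso ~30 cm farther back.
close_ratio = apparent_size(head, 40) / apparent_size(torso, 70)

# Distant shot: face at 300 cm, torso at 330 cm.
far_ratio = apparent_size(head, 300) / apparent_size(torso, 330)

print(round(close_ratio, 3))  # head-to-torso apparent ratio, close up
print(round(far_ratio, 3))    # same ratio from farther away
```

      The close-up ratio comes out much larger, so a model averaging over lots of close-up portraits learns heads that look oversized relative to bodies.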