Have you noticed? The “realistic” or “realism” AI results for men and women have noticeably big heads (often even huge) compared to their own bodies, as if the camera were very close to the face. Even the results in Google Search show this phenomenon, and it is most obvious with Asian women. Why does this happen, and why do different AIs all produce the same pattern?

Try searching for it now, with something like “realistic AI Asian woman”, and you’ll even see images with genuinely HUGE heads, most especially the “realistic” ones. Different AI programs, and yet the same huge-head pattern from all of them?

Any actual AI users, could you please share your experience too? Thanks in advance for any enlightenment.

  • @j4k3
    115 minutes ago

    Every model and fine-tune is different. With more advanced tools, you just need to negative out the sources. If you have ComfyUI, use the manager to get the Derfuu ComfyUI Modded Nodes package and use the debug node to see what is actually happening with sigmas, CLIP, and conditioning. It is all just numbers. Once you can break these out as numbers, you can start understanding the whole process much better.

    I like to find information about how people train and tune models using various sources and tag generation tools. This helps to isolate a lot of the obscure issues. Some really helpful negative tags are: ai generated, source unknown, cartoon, NPC, anime, render, practice, noise, almost, plastic, deep fake, Photoshop, overlay

    However, tags like these are more common in mixed media that include art. You can often get better results by describing the subject in plain English, without tags. All of the alignment issues are also real-world adjacent. You can often get better images if you negative out: creepy face, Guy Fawkes mask, clown, stone eyes, roll eyes, morality.

    In the positive, you really need to know what datasets and tags were used in training to make effective keyword terms. However, it also helps to explore a more scientific definition of the human body and what beauty really is. Terms like these can help: neoteny, symmetry, small nose mouth brow, big eyes, forward slant face, small joints, delicate, Tanner Stage 5 physical development

    You might also add “healthy weight” to the negative, as there is a baked-in bias. Some models also lock up some aspects of generation behind forbidden fruit, which can pull out some aspects in the positive.
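    To make the positive/negative advice above concrete, here is a minimal sketch of assembling prompt strings from terms like the ones listed. The specific term choices are just examples from this thread, and the diffusers pipeline call at the bottom is shown only as a comment, since the model name and setup are assumptions, not a recommendation:

```python
# Sketch: building positive and negative prompts from the term lists above.
# The terms themselves are examples pulled from this thread, not a recipe.

positive_terms = [
    "photo of a woman",   # plain-English subject description, no tags
    "symmetry",
    "small nose mouth brow",
    "big eyes",
    "delicate",
]

negative_terms = [
    "ai generated", "cartoon", "anime", "render",
    "plastic", "deep fake", "Photoshop", "overlay",
    "creepy face", "clown",
    "healthy weight",     # counteracts the baked-in weight bias noted above
]

prompt = ", ".join(positive_terms)
negative_prompt = ", ".join(negative_terms)

print(prompt)
print(negative_prompt)

# Illustrative use with Hugging Face diffusers (not run here; the model
# name is a placeholder):
# from diffusers import StableDiffusionPipeline
# pipe = StableDiffusionPipeline.from_pretrained("some/model")
# image = pipe(prompt=prompt, negative_prompt=negative_prompt).images[0]
```

    Most UIs (ComfyUI included) take the same two strings; only the wiring differs.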

    A lot of this gets into underage adjacent territory which can make some people uncomfortable, and for that I apologise. That isn’t my intent. The more scientific definition of beauty based on blind testing follows neoteny in humans which is the retention of some juvenile features such as big eyes and a small mouth and nose. These feature sets apply to both sexes. These juvenile features are often what is triggering alignment layers to obfuscate and create oddball outputs. Most uncensored models struggle with this too. Even something like Pony Realism or Flux models will steer the output towards a more rendered cartoon as the output begins to get flagged by the alignment bias layers. If you want to avoid this behavior in a more deterministic way, you really need to understand the scope of alignment bias.

    Alternatively, if you define the subject as a synthetic AI human and avoid any even slightly subtle hinting that the person is real or even ‘in the style of’ a real person, it will open up a lot of freedom in what is generated.

  • Rhynoplaz
    83 hours ago

    I imagine that since there are so many examples of characters with large heads throughout art, especially cartoons, it considers an oversized head an acceptable variable to play with. The more realistic the art form, the less variation you see, but there are still big-headed realistic drawings and paintings.

    • Natanael
      42 hours ago

      Also, so many head photos are taken from an angle right in front of the face or a bit higher, distorting the shoulders in ways you don’t notice but which matter when the algorithm tries to splice everything together into an averaged model.
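      That perspective effect can be quantified with a simple pinhole-camera model, where apparent size is proportional to real size divided by distance. All the distances and sizes below are made-up illustrative numbers:

```python
# Pinhole model: apparent (angular) size is proportional to
# real size / distance from the camera. Close-up portraits put the
# torso noticeably farther away than the face, shrinking it.

def apparent_size(real_size_cm, distance_cm):
    """Angular-size proxy under a pinhole model."""
    return real_size_cm / distance_cm

head, torso = 25.0, 60.0  # rough real sizes in cm (assumptions)

# Close-up shot: face at 40 cm, torso ~30 cm farther back.
close_ratio = apparent_size(head, 40) / apparent_size(torso, 70)

# Distant shot: face at 300 cm, torso at 330 cm.
far_ratio = apparent_size(head, 300) / apparent_size(torso, 330)

print(round(close_ratio, 3))  # head-to-torso apparent ratio, close up
print(round(far_ratio, 3))    # same ratio from farther away
```

      The close-up ratio comes out much larger, so a model averaging over lots of close-up portraits learns heads that look oversized relative to bodies.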