by timacles
932 days ago
You’re partially correct, but this isn’t an explanation for why they’re rendered wrong. Hands are extremely complicated mechanically. They are the most complex creation evolution has come up with and part of the reason humans are able to do what they do. Hands are like the chess game of anatomy, each segment of a hand has so many permutations that an AI simply doesn’t have enough reference info to animate it properly
Generative models, arguably, have little trouble with complexity given enough training data. Faces are a perfect example. We both agree that image models, at least, lack that data for hands.
But there are many complex things that image models render from sparse training data without setting off our perception as strongly. Hands fall into the uncanny valley: we are deeply familiar with them, so small errors stand out.
This is why I mention lighting and focus. Both are subtle and complex, and image models have tons of training examples of each. That's still not enough for generative image models to consistently produce photographs that would fool someone who has taken the time to build an accurate mental model of how camera images look. But it fools most people.
Handling lighting and focus well involves both generating the entire scene in which the photograph is taken and an accurate model of the camera's design and how it has been configured for the shot. Both of these are large spaces full of hidden variables that popular image models are not presently trained on.
Many people know you can look at the background of a generated image to identify irregularities. Checking that the lighting has a consistent angle (or multiple angles indicative of a cogent set of scene lights) is another good check. Additionally, if you have an eye for bokeh, you can often detect whether it's faked when it appears in an image. Finally, even smooth blurs often do not reflect either a physically plausible background being blurred or a consistent focal plane cutting through the 3D scene. These are all additional complexities that image-generating models often haven't mastered (for now). But many judges of their outputs haven't either, so it's easy to miss these "mistakes".
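To make the focal-plane point concrete, here is a rough sketch of how blur should relate to depth in a real photo. It assumes an idealized thin-lens camera, which is a simplification of any real lens, and the function name and example numbers (a 50mm lens at f/1.8 focused at 2 m) are my own, chosen only for illustration.

```python
def circle_of_confusion_mm(focal_length_mm, f_number, focus_dist_mm, subject_dist_mm):
    """Diameter of the blur disc on the sensor under a thin-lens approximation."""
    # Aperture diameter follows from the focal length and the f-number.
    aperture_mm = focal_length_mm / f_number
    # Magnification of an object sitting exactly at the focus distance.
    magnification = focal_length_mm / (focus_dist_mm - focal_length_mm)
    # Blur scales with how far the subject is from the focal plane, relative to its distance.
    return aperture_mm * magnification * abs(subject_dist_mm - focus_dist_mm) / subject_dist_mm


# Illustrative (assumed) setup: 50mm lens at f/1.8, focused at 2 m.
# Blur is zero at the focus distance and grows on either side of it.
for subject_dist_mm in (1000, 2000, 4000, 8000):
    blur = circle_of_confusion_mm(50, 1.8, 2000, subject_dist_mm)
    print(f"{subject_dist_mm} mm -> blur disc {blur:.3f} mm")
```

In a real photograph every blurred region has to be explainable by one such combination of focal length, f-number and focus distance. When the blur in a generated image doesn't track depth this way, that's the inconsistent focal plane I mean.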