Hacker News new | ask | show | jobs
by janalsncm 964 days ago
There’s a saying, “All models are wrong. Some models are useful.”

No matter how granular you get with specific ethnic groups, it’s not possible to capture the long tail of all the types of people who exist, and all of their appearances.

If you ask Midjourney to draw a man, should he be wearing clothes? A man might be naked. Should he have two arms and two legs? Some men don’t. What about two eyes? What color skin should he have?

The fact that Midjourney will never draw a third degree burn victim when simply asked to draw “a man” isn’t a flaw in the model. The model is biased, yes, but it is biased towards utility.

1 comments

It's biased towards uniformity. What we observe in the article above is a distinct lack of variance in the model's output. One way this lack of variance comes across is as cultural bias, but it is also striking how flat and homogeneous are the results, even for 100 generations, given the same prompt. You'd expect some variety- but all the Indian men aren't just 60-year old sadhus, they are all slight variations of essentially the same 60-year old sadhu.

For me, the salient observation is the complete lack of any kind of creativity or anything approximating imagination, of those models, despite a constant barrage of opinions to the contrary. Yes, if you asked me to draw you "a mexican man" (not "person") I'd start with a somberro, moustache, a poncho, maybe a donkey if I was going for a Lucky Luke kind of vibe. But if you asked 100 people to draw "a mexican man" and it turned out they all converged on the same few elements you'd nevertheless have 100 clearly, unambiguously different images of the same kind of "mexican man", often with the same trappings, but each with a clearly distinct style.

It is this complete lack of variance, this flattening of detail into a homogeneous soup, that is the most notable characteristic, and limitation, of these models.

> It is this complete lack of variance, this flattening of detail into a homogeneous soup, that is the most notable characteristic, and limitation, of these models.

And yet when hands came back with beautiful variations in finger count, people were unhappy.