by aruss 964 days ago
The problem with this specific instance is that the images generated mix and match characteristics that are unique to those ethnic groups. So the models are reducing real ethnic differences to simple stereotypes. In other words, the models are wrong, and wrong in ways that eliminate diversity.
3 comments

There’s a saying, “All models are wrong. Some models are useful.”

No matter how granular you get with specific ethnic groups, it’s not possible to capture the long tail of all the types of people who exist, and all of their appearances.

If you ask Midjourney to draw a man, should he be wearing clothes? A man might be naked. Should he have two arms and two legs? Some men don’t. What about two eyes? What color skin should he have?

The fact that Midjourney will never draw a third-degree burn victim when simply asked to draw “a man” isn’t a flaw in the model. The model is biased, yes, but it is biased towards utility.

It's biased towards uniformity. What we observe in the article above is a distinct lack of variance in the model's output. One way this lack of variance comes across is as cultural bias, but it is also striking how flat and homogeneous the results are, even across 100 generations of the same prompt. You'd expect some variety, but all the Indian men aren't just 60-year-old sadhus, they are all slight variations of essentially the same 60-year-old sadhu.

For me, the salient observation is the complete lack of any kind of creativity, or anything approximating imagination, in those models, despite a constant barrage of opinions to the contrary. Yes, if you asked me to draw you "a Mexican man" (not "person") I'd start with a sombrero, a moustache, a poncho, maybe a donkey if I was going for a Lucky Luke kind of vibe. But if you asked 100 people to draw "a Mexican man" and it turned out they all converged on the same few elements, you'd nevertheless have 100 clearly, unambiguously different images of the same kind of "Mexican man", often with the same trappings, but each with a clearly distinct style.

It is this complete lack of variance, this flattening of detail into a homogeneous soup, that is the most notable characteristic, and limitation, of these models.

> It is this complete lack of variance, this flattening of detail into a homogeneous soup, that is the most notable characteristic, and limitation, of these models.

And yet when hands came back with beautiful variations in finger count, people were unhappy.

Ethnic differences among groups that aren't extremely isolated are mostly gibberish perpetuated with cultural identity politics.

Offspring tend to go in one direction or another, so one group may end up inbred, but a mix is more accurate than compiling the beliefs of human cultures about their genetic traits and isolation from each other.

Actually no, they are reducing different ethnic groups to what is becoming the current norm: everyone being mixed.

Inside cities, no one is specifically looking for a member of their own tribe to marry, so the ability to identify ethnic groups by facial features is collapsing.