Hacker News
by Tainnor 968 days ago
AI systems have a tendency to overamplify human biases and stereotypes to the point that it looks ridiculous even to most (not particularly "woke") humans.

If you told an actual artist to draw 5 pictures of Indian people, I doubt you'd get 5 old men with turbans and beards. Most people understand that reality is more varied than this.

This reminds me of a paper my former coworker wrote about how Google Translate, a couple of years ago, would misapply gender stereotypes to gendered nouns in a way that humans wouldn't. The word "table" translates to German as "Tisch" (where you eat; masculine) or as "Tabelle" (in a spreadsheet; feminine). It turned out that when accompanied by an adjective stereotypically associated with masculinity (e.g. "strong"), the system would translate "table" as "Tisch", but in the presence of a stereotypically feminine adjective (like "soft"), it would pick "Tabelle". This is ridiculous: no human translator (not even the most sexist) would do that, as we understand that grammatical gender isn't biological or sociological gender. But the AI system somehow can't say "I don't know what the translation is, it's ambiguous", and so it just makes up a pattern where there should be none.


> If you told an actual artist to draw 5 pictures of Indian people, I doubt you'd get 5 old men with turbans and beards. Most people understand that reality is more varied than this.

You have to keep in mind that with these models, it's not like asking an artist to draw 5 pictures of something - it's like asking 5 different artists, who don't know about each other, to each draw a single picture of something.

Generated images are independent: there's no system there to notice that it's generating multiple images from one prompt and that it might therefore want to ensure they're not too similar. I hear OpenAI is hacking around this with DALL-E 3 by having the prompt preprocessor (GPT-4 expanding your prompt) inject stuff like "diverse people" many times in the expanded prompt, to bias things the other way.
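
To make the "5 independent artists" point concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the `STYLE_WEIGHTS` numbers are made up, not measured from any real model, and `draw_as_batch` is just one hypothetical way a batch-aware system could avoid repeats. It only shows that independent draws from a skewed distribution tend to pile up on the most likely option, while a sampler that sees the whole batch can spread out.

```python
import random
from collections import Counter

# Hypothetical, made-up weights standing in for a model's skewed prior
# over how to depict a subject; not measured from any real system.
STYLE_WEIGHTS = {
    "old man, turban, beard": 0.85,
    "young man, t-shirt": 0.05,
    "woman, sari": 0.05,
    "child, school uniform": 0.05,
}

def draw_independently(n, rng):
    """n separate requests: each draw knows nothing about the others."""
    styles = list(STYLE_WEIGHTS)
    weights = list(STYLE_WEIGHTS.values())
    return [rng.choices(styles, weights=weights, k=1)[0] for _ in range(n)]

def draw_as_batch(n, rng):
    """One 'artist' doing all n: no style repeats until every style is used."""
    remaining = dict(STYLE_WEIGHTS)
    picks = []
    for _ in range(n):
        if not remaining:
            # All styles used once; start a fresh round.
            remaining = dict(STYLE_WEIGHTS)
        styles = list(remaining)
        weights = list(remaining.values())
        choice = rng.choices(styles, weights=weights, k=1)[0]
        picks.append(choice)
        del remaining[choice]
    return picks

if __name__ == "__main__":
    rng = random.Random(0)
    print("independent:", Counter(draw_independently(5, rng)))
    print("batch-aware:", Counter(draw_as_batch(5, rng)))
```

The batch-aware version is deliberately crude (sampling without replacement); the point is only that some component has to see all the outputs together before it can care about variety.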

> I hear OpenAI is hacking around this with DALL-E 3 by having the prompt preprocessor (GPT-4 expanding your prompt) inject stuff like "diverse people" many times in the expanded prompt, to bias things the other way.

I just asked GPT-4 for images of an Indian man, and it created four separate prompts to pass to DALL-E.

  1. Photo of an Indian man wearing traditional attire, standing against a scenic backdrop with a serene expression.
  2. Oil painting of an Indian man in a kurta, playing a sitar under a banyan tree.
  3. Illustration of an Indian man in modern clothing, holding a cup of chai while reading a newspaper in a bustling city.
  4. Watercolor painting of an Indian man practicing yoga in a tranquil setting near a river.

When I asked it to "Show me photos of diverse Indian men", the prompts became:

  1. Photo of three Indian men from different regions, each wearing distinct traditional attire, standing side by side in a vibrant market setting. (The resulting image literally looks like triplets in different attire)
  2. Photo of a group of Indian men from various descents, engaging in a conversation at a local tea stall.
  3. Photo of young and elderly Indian men, representing diverse backgrounds, enjoying a game of chess in a park.
  4. Photo of Indian men of diverse ages and regions participating in a traditional dance ceremony. (This one was funny. It was a bunch of Indian men sitting with their legs crossed, with one Indian man in a cross-legged position floating above all the rest)
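
For what it's worth, the expansion step seen in these prompts (one short request turned into several differently worded prompts) could be sketched roughly like this. This is a guess at the general shape, not how GPT-4 or DALL-E actually do it: `expand_prompt`, `MEDIA`, and `SETTINGS` are hypothetical names, and the lists are loosely adapted from the four prompts quoted earlier.

```python
from itertools import cycle, islice

# Hypothetical lists for illustration only; the real preprocessor's
# behaviour is not documented in this thread.
MEDIA = ["Photo", "Oil painting", "Illustration", "Watercolor painting"]
SETTINGS = [
    "standing against a scenic backdrop",
    "playing a sitar under a banyan tree",
    "reading a newspaper in a bustling city",
    "practicing yoga near a river",
]

def expand_prompt(subject, n=4):
    """Turn one short request into n differently worded image prompts."""
    # cycle() lets n exceed the list lengths; islice() stops after n items.
    media = islice(cycle(MEDIA), n)
    settings = islice(cycle(SETTINGS), n)
    return [f"{m} of {subject}, {s}." for m, s in zip(media, settings)]

if __name__ == "__main__":
    for prompt in expand_prompt("an Indian man"):
        print(prompt)
```

Note that a template like this only varies medium and setting. The subject string is passed through unchanged, which fits with the "triplets in different attire" result: variety in the wording doesn't by itself produce variety in the people depicted.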

I actually think asking 5 independent artists to draw an Indian man would still produce wildly different depictions than a model would, and that's because... well, I don't think of turban == Indian personally. I think of a brown guy with a thick black beard and hair, in a t-shirt and jeans... because I work in tech and that's like 90% of the Indian guys I work with. I can imagine 5 different artists would themselves have 5 different ideas of what a generic Indian guy would look like.