by bambax 968 days ago
> Bias occurs in many algorithms and AI systems (...) In an analysis of more than 5,000 AI images, Bloomberg found that images associated with higher-paying job titles featured people with lighter skin tones, and that results for most professional roles were male-dominated.

The use of the term "bias" here is disputable IMHO. What these systems describe is reality.

We should aim to change the world, not the resulting -- faithful -- image of that world in AI. Cure the disease, not the symptoms.


Sure, curing the disease is more important than curing the symptoms, though the two aren't entirely unlinked.

What the systems describe isn't reality though. Mexicans invariably wearing sombreros doesn't reflect Mexican fashion, it reflects whether people have bothered to tag the image with "Mexican" or not.

If you can tag reality in ways in which US frat boys' fancy dress preferences are somewhat representative of the label "Mexican" and famous Mexicans in Mexico City usually aren't, then it certainly isn't necessary for job title tags to be highly correlated with ethnicity (posed stock photos have tended to push back against this for years). And whilst it's true that certain occupations are dominated by white males in the West, they're certainly not the world's "default" people; that's more a reflection of the sort of English-speaking internet power users whose content gets hoovered up by the dataset. And that is definitely a bias, even if it's a completely unintentional one.

In general it's "reality as seen through the narrow lens of people uploading and tagging photos, often not even with the intention of conveying useful information to an image generation algorithm". That reality includes a lot of biases, some of them more accurate than others and some of them more benign than others.

My comment above wasn't about Mexicans (that's another comment) but about whether describing people with a high-paying job as having a light skin tone is "biased" or a reflection of reality.

Of course, as you say, the problem (if there is one) is in the dataset and not in the program. But if we decide that this should be corrected after the fact, then at that moment we are sure to introduce an actual bias.

On what basis? Who decides what bias should be applied, and the appropriate amount?

> What these systems describe is reality.

Not quite. The "reality" (population) that an AI model represents is closer to "images on the internet" than to "population of X country" or "population of the world". (I am comparing a set of images to a set of people on purpose.) Here's a quote from an article about Stable Diffusion [1]:

> For example, the model generated images of people with darker skin tones 70% of the time for the keyword “fast-food worker,” even though 70% of fast-food workers in the US are White. Similarly, 68% of the images generated of social workers had darker skin tones, while 65% of US social workers are White.

> We should aim to change the world, not the resulting -- faithful -- image of that world in AI. Cure the disease, not the symptoms.

Additionally, AI model companies should warn model users that images of "<member of X group>" such as "a Mexican person" are not representative of X group. Nonetheless, I would appreciate an AI model which does something along the lines of crystaln's suggestion [2]:

> [If prompted for "a Nigerian person"] The algorithm could return a random Nigerian ethnic group proportional to their actual population.

(I'm presuming that "a random Nigerian ethnic group" refers to "a member of a random Nigerian ethnic group".)
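
For what it's worth, the sampling step in that suggestion can be sketched in a few lines. This is only an illustration of population-proportional sampling, not how any existing image model works: `sample_group`, `build_prompt`, and the `PLACEHOLDER_SHARES` table are hypothetical names, and the shares are made-up placeholders rather than real demographic figures. A real system would need sourced population data.

```python
import random


def sample_group(shares, rng=random):
    """Pick one group name with probability proportional to its share.

    `shares` maps a group name to a non-negative weight. The weights do
    not need to sum to 1; random.choices normalizes them.
    """
    groups = list(shares)
    weights = [shares[g] for g in groups]
    if not groups or any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("shares must contain at least one positive weight")
    return rng.choices(groups, weights=weights, k=1)[0]


# Hypothetical placeholder data, NOT real population figures.
PLACEHOLDER_SHARES = {"group_a": 0.5, "group_b": 0.3, "group_c": 0.2}


def build_prompt(base_prompt, shares, rng=random):
    """Append a sampled group to the prompt before it reaches the model."""
    group = sample_group(shares, rng)
    return f"{base_prompt}, member of {group}"
```

Calling `build_prompt("a person", PLACEHOLDER_SHARES)` many times would yield each group at roughly its listed share. Of course, this only moves the "who decides" question into whoever supplies the table.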

[1] https://www.bloomberg.com/graphics/2023-generative-ai-bias/

[2] https://news.ycombinator.com/item?id=37964732