| HN Mirror

by d_burfoot 125 days ago

> they mimic and amplify the inherent racism present in their own training data

LLMs turn out to be biased against white men:

https://www.lesswrong.com/posts/me7wFrkEtMbkzXGJt/race-and-g...

> When present, the bias is always against white and male candidates across all tested models and scenarios. This happens even if we remove all text related to diversity.

4 comments

dogmayor 125 days ago

Important sentences immediately before the ones you quote.

> For our evaluation, we inserted names to signal race / gender while keeping the resume unchanged. Interestingly, the LLMs were not biased in the original evaluation setting, but became biased (up to 12% differences in interview rates) when we added realistic details like company names (Meta, Palantir, General Motors), locations, or culture descriptions from public careers pages.

daveguy 125 days ago

Hah. Even LLMs know Meta and Palantir are evil af.

aprilthird2021 125 days ago

These biases come from post-training. You have to give the model such directives in post-training to correct the biases it picks up from training data scraped from the whole internet (and other datasets like books, etc.).

biophysboy 125 days ago

Looking at the paper, the effect is significant but weak (5-7%), even with the conditionals that magnify the effect. I would be curious to see the effect if this experiment were performed on a slightly different categorical variable (e.g., how two white ethnicities are treated). I do think it's bad if preferences are "baked in" to the default, though; prompting them away seems like a bad solution.

113 125 days ago

That's not a reliable source.
