Neural networks reveal gender bias in language

	Neural networks reveal gender bias in language (technologyreview.com)
	13 points by yunque 3609 days ago

3 comments

Jarwain 3609 days ago

I'm not a big fan of the title; it implies the bias is inherent in the language. The article, however, attributes the bias to the input data: the biases that professional journalists write into their news articles.

Although that makes me wonder whether the bias is due to the journalists themselves, or whether it's reflecting inequalities in real life.

Scaevolus 3609 days ago

Many biases are real and self-reinforcing. In the US, doctors are ~70% male and nurses are ~90% female.

GFK_of_xmaspast 3609 days ago

"Real" in the sense that any social construct is real.

bbctol 3609 days ago

Was this surprising? What exactly did it find?

I haven't been able to track down the original source of the "Machine learning is like money laundering for bias" quote, but I worry about it in situations like this. To be clear, I think the phenomenon the article is discussing is real and valid: the implicit linking of genders to occupations is pretty well-documented, and I don't doubt they've found something. But it's a little difficult to say what's actually been contributed to the literature here.

That said, their work on manually "de-biasing" the language by applying mathematical transformations to the word space is definitely interesting... I'm just not sure what it is yet.

skybrian 3609 days ago

The contribution to the literature is showing how to measure and remove one source of bias from word2vec. Were you expecting something else?
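
For anyone wondering what "measure and remove" could look like in practice, here is a minimal sketch, not the paper's actual code. It assumes bias is measured as the cosine between a word vector and a "he minus she" direction, and removed by projecting that direction out. The 3-d vectors are invented for illustration; real word2vec vectors have hundreds of dimensions, and the helper names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def bias_direction(emb, pair=("he", "she")):
    # Assumption: a single definitional pair is enough to define the direction.
    a, b = emb[pair[0]], emb[pair[1]]
    return normalize([x - y for x, y in zip(a, b)])

def bias_score(vec, direction):
    # Cosine between the word vector and the unit-length bias direction.
    return dot(normalize(vec), direction)

def neutralize(vec, direction):
    # Subtract the component of vec that lies along the bias direction.
    p = dot(vec, direction)
    return [x - p * d for x, d in zip(vec, direction)]

# Toy, made-up embeddings (not taken from any trained model).
toy = {
    "he": [1.0, 0.0, 0.2],
    "she": [-1.0, 0.0, 0.2],
    "doctor": [0.4, 0.9, 0.1],
    "nurse": [-0.5, 0.8, 0.1],
}

g = bias_direction(toy)
before = {w: bias_score(toy[w], g) for w in ("doctor", "nurse")}
after = {w: bias_score(neutralize(toy[w], g), g) for w in ("doctor", "nurse")}

print("before:", before)
print("after:", after)
```

In this toy setup "doctor" scores positive and "nurse" negative before neutralizing, and both score zero along the direction afterwards. Whether a single direction captures the bias in a real embedding is a separate question.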

grownseed 3609 days ago

This is really interesting. Looking at the references in the paper, the first one is about racial bias. A question I've often asked myself (in various forms) is: rather than having the researchers put the bias forward in the first place, would it be possible to let the biases emerge from the model? Put another way, can a system be designed such that it automatically corrects (or at least identifies) biases inherent to its training data? The implications of being able to expose biases we're oblivious to would be interesting in and of themselves.
