I'm not a big fan of the title; it implies the bias is inherent in the language. The article, however, attributes the bias to the input data: the biases that professional journalists write into their news articles.
Although that makes me wonder whether the bias is due to the journalists themselves, or whether it's reflecting inequalities in real life.
I haven't been able to track down the original source of the "Machine learning is like money laundering for bias" quote, but I worry about it in situations like this. To be clear, I think the phenomenon the article is discussing is real and valid: the implicit linking of genders to occupations is pretty well-documented, and I don't doubt they've found something. But it's a little difficult to say what's actually been contributed to the literature here.
That said, their work on manually "de-biasing" the language by applying mathematical transformations to the word space is definitely interesting. I'm just not sure what it amounts to yet.
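For anyone wondering what a "mathematical transformation to the word space" could look like, here is a minimal sketch of one common idea: pick a direction in the embedding space (say, "he" minus "she") and project it out of the occupation vectors. I'm not claiming this is exactly what the authors do. The 3-d vectors below are made up for illustration, and `remove_direction` is a hypothetical helper name, not something from the paper.

```python
import numpy as np

def remove_direction(vectors, direction):
    """Project each row of `vectors` onto the subspace orthogonal to `direction`."""
    unit = direction / np.linalg.norm(direction)
    # Subtract each vector's component along the unit direction.
    return vectors - np.outer(vectors @ unit, unit)

# Toy 3-d "embeddings", invented for illustration only (not real data).
emb = {
    "he": np.array([1.0, 0.2, 0.0]),
    "she": np.array([-1.0, 0.2, 0.0]),
    "engineer": np.array([0.4, 0.9, 0.1]),
    "nurse": np.array([-0.5, 0.8, 0.2]),
}

# Assumed "gender direction": the difference between one word pair.
gender_direction = emb["he"] - emb["she"]

occupations = np.stack([emb["engineer"], emb["nurse"]])
neutralized = remove_direction(occupations, gender_direction)

# After projection, neither occupation has any component along the direction.
residual = neutralized @ (gender_direction / np.linalg.norm(gender_direction))
```

With real embeddings you would presumably estimate the direction from many word pairs rather than one, and you'd have to decide which words should be neutralized at all, which is where the "manual" part comes in.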
This is really interesting. Looking at the references in the paper, the first one is about racial bias. A question I've often asked myself, in various forms: rather than having the researchers put the bias forward in the first place, would it be possible to let the biases emerge from the model? Put another way, can a system be designed so that it automatically corrects (or at least identifies) biases inherent in its training data? Being able to expose biases we're oblivious to would be interesting in and of itself.