by Bartweiss 2620 days ago
This story has been constantly misrepresented because Reuters absolutely botched its initial report. Amazon was never building a tool to decide which interviewed candidates to hire; it was building a tool for discovering candidates. The tool was biased, but that gender bias wasn't the proximate reason for scrapping it.

As far as I can tell from later stories (e.g. [1], [2]), what Amazon actually did was build a tool to show recruiters 'quality' predictions for all resumes, for instance as they scrolled LinkedIn. But they trained it on resumes submitted to Amazon for various positions, possibly also adding weight to resumes that produced hires.

In which case the problem is painfully obvious: the system effectively had no negative training data, and its positive examples (submitted resumes) didn't actually match the desired output (qualified resumes). It was computing the degree of similarity between a gender-neutral-ish pool (resumes posted online) and a gender-skewed pool (resumes submitted to Amazon), and it tried to make that comparison with whatever data was available, like devaluing resumes that mentioned women's colleges. (This wasn't just a proxy-variable thing; the model essentially learned to weight on gender.) Amazon's team apparently caught this issue and did the usual things like blinding on those words. But they were scared of uncaught factors; reading between the lines, they were unable to "detrain" biases the way neural nets do because their dataset and task didn't match.
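
To make the failure mode concrete, here's a toy sketch. It is my own illustration, not Amazon's system: the pools, the token names, and the log-odds scoring are all made-up assumptions. Every resume lists the same skill, so "qualification" is identical, and the only thing the scorer can learn is what separates the submitted pool from the online pool.

```python
import math
from collections import Counter


def token_log_odds(submitted, online, smoothing=1.0):
    """Weight each token by log P(token | submitted) / P(token | online).

    Hypothetical stand-in for a model trained only on "resumes we received":
    there are no true negatives, just a second, differently skewed pool.
    """
    sub_counts = Counter(tok for resume in submitted for tok in set(resume))
    onl_counts = Counter(tok for resume in online for tok in set(resume))
    weights = {}
    for tok in set(sub_counts) | set(onl_counts):
        p_sub = (sub_counts[tok] + smoothing) / (len(submitted) + 2 * smoothing)
        p_onl = (onl_counts[tok] + smoothing) / (len(online) + 2 * smoothing)
        weights[tok] = math.log(p_sub / p_onl)
    return weights


def score(resume, weights):
    """Sum the learned weights of the tokens present in a resume."""
    return sum(weights.get(tok, 0.0) for tok in set(resume))


def blind(pool, banned):
    """Drop banned tokens from every resume (the 'usual' mitigation)."""
    return [[tok for tok in resume if tok not in banned] for resume in pool]


# Made-up pools: same skill everywhere, only gender-correlated tokens differ.
resume_a = ["python", "chess_club"]
resume_b = ["python", "womens_college", "womens_chess_club"]
submitted = [resume_a] * 80 + [resume_b] * 20  # gender-skewed pool
online = [resume_a] * 50 + [resume_b] * 50     # gender-neutral-ish pool

weights = token_log_odds(submitted, online)

# Blind on the obvious term, then retrain on the blinded pools.
banned = {"womens_college"}
blinded_weights = token_log_odds(blind(submitted, banned), blind(online, banned))

print("skill weight:", weights["python"])
print("gendered term weight:", weights["womens_college"])
print("uncaught term after blinding:", blinded_weights["womens_chess_club"])
```

In this toy setup the skill token gets a weight of zero, because it is equally common in both pools, while the gendered token gets a negative weight. Blinding removes that one token, but the correlated token nobody thought to ban still carries the same penalty, which is the "uncaught factors" worry in miniature.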

Ultimately, the tool was apparently scrapped because it made selections "almost at random". Which, again, isn't exactly surprising in light of the absolutely bonkers choice of training examples.

[1] https://www.aclu.org/blog/womens-rights/womens-rights-workpl...

[2] https://www.ml.cmu.edu/news/news-archive/2018/october/amazon...