by BurningFrog 2811 days ago

Here's a control question to check whether you're making a certain intellectual mistake.

The data set will also be heavily skewed against people named "David", who probably make up only ~1% of the successful applicants.

Would you also expect the machine to be biased against candidates named David?

3 comments

astrodust 2811 days ago

What if people named David got hired 10/100 times in the past but people named Denise only got hired 6/100 times?

Hiring practices as expressed in the data get picked up by the machine and applied accordingly. As such, David is predicted to be a better hire than Denise.

This is not about "David" vs. "Denise", but about how the machine learning process will aggregate and classify names. David and David-like names will come out on top, while obscure names it has no idea how to deal with (0/0 historically) will probably be given no weighting at all.

Sorry, "Daud"! Our algorithm says David is better.
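As a rough illustration of the point above, here is a minimal sketch of a scorer that simply memorizes historical hire rates per name. The `history` table, the `naive_score` helper, and the "Daud" entry with no records are assumptions made up for this example; nothing here reflects how any real hiring system works.

```python
# Hypothetical historical counts: name -> (hired, applicants).
history = {"David": (10, 100), "Denise": (6, 100), "Daud": (0, 0)}


def naive_score(name):
    """Return the historical hire rate, or None when there is no data."""
    hired, applicants = history.get(name, (0, 0))
    if applicants == 0:
        return None  # 0/0 historically: nothing to base a weighting on
    return hired / applicants


scores = {name: naive_score(name) for name in history}

# David outranks Denise purely because of past hiring outcomes.
print(scores["David"] > scores["Denise"])
# Daud gets no score at all.
print(scores["Daud"])
```

A scorer like this reproduces whatever pattern is in the past decisions, and a name with no history ends up unscored rather than judged on merit.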


kareemsabri 2811 days ago

I would expect the AI isn't fed names as an input, but rather things Amazon wants to weigh, like experience, awards and education.


joshuamorton 2811 days ago

This isn't correct. The worry isn't that a single group is small; it's that a single group is large (basically, if one group is large, you can get by ignoring all the smaller groups).

This is most common with binary problems.
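A toy sketch of that effect on a binary problem, with made-up numbers: the group sizes, labels, and the `majority_only_model` function are all assumptions for illustration. A model that only ever predicts what is right for the large group still looks accurate overall.

```python
# Hypothetical data: (group, true_label) pairs.
# Large group "A" has label 1; small group "B" has label 0.
samples = [("A", 1)] * 900 + [("B", 0)] * 100


def majority_only_model(group):
    """Ignore the input and always predict the large group's label."""
    return 1


def accuracy(rows):
    """Fraction of rows where the model's prediction matches the label."""
    correct = sum(majority_only_model(group) == label for group, label in rows)
    return correct / len(rows)


overall = accuracy(samples)
small_group = accuracy([row for row in samples if row[0] == "B"])

print(overall)      # accuracy across everyone
print(small_group)  # accuracy on the small group only
```

With these numbers the overall accuracy is 0.9 while the accuracy on the small group is 0.0, so a training objective that only rewards overall accuracy has little reason to learn anything about group "B".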
