by BurningFrog 2811 days ago
Here's a control question to check whether you're making a certain intellectual mistake.

The data set will also be heavily skewed against people named "David": they are probably only ~1% of the successful applicants.

Would you also expect the machine to be biased against candidates named David?

3 comments

What if people named David got hired 10/100 times in the past but people named Denise only got hired 6/100 times?

Hiring practices as expressed in the data get picked up by the machine and applied accordingly. As such, David is predicted to be a better hire than Denise.

This is not about "David" vs. "Denise", but about how the machine learning process will aggregate and classify names. David and David-like names will come out on top, while obscure names it has no idea how to deal with (0/0 historically) will probably be given no weighting at all.

Sorry, "Daud"! Our algorithm says David is better.
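
To make the arithmetic concrete, here is a rough sketch of the effect, assuming a deliberately naive model that just scores each name by its historical hire rate. The `history` data, `hire_rates` and `score` are all made up for illustration; nothing here reflects how any real hiring system works.

```python
from collections import defaultdict

# Hypothetical hiring history matching the numbers above:
# David hired 10 out of 100 times, Denise 6 out of 100 times.
history = (
    [("David", True)] * 10 + [("David", False)] * 90
    + [("Denise", True)] * 6 + [("Denise", False)] * 94
)

def hire_rates(records):
    """Return the historical hire rate per name."""
    counts = defaultdict(lambda: [0, 0])  # name -> [hired, total]
    for name, hired in records:
        counts[name][0] += int(hired)
        counts[name][1] += 1
    return {name: h / t for name, (h, t) in counts.items()}

def score(name, rates):
    # A name never seen in the data (0/0) has no estimate at all.
    return rates.get(name)

rates = hire_rates(history)
print(score("David", rates))   # 0.1
print(score("Denise", rates))  # 0.06
print(score("Daud", rates))    # None
```

David outranks Denise purely because of past decisions, and "Daud" gets no score because there is nothing to divide.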

I would expect the AI isn't fed names as an input, but rather things Amazon wants to weigh, like experience, awards and education.

This isn't correct. The worry isn't that a single group is small, it's that a single group is large (basically, if one group is large, you can get by ignoring all the smaller groups).

This is most common with binary classification problems.
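
A toy sketch of what "getting by" looks like, assuming an invented binary data set where 95% of rows come from one group and the correct label differs between groups. The data, `majority_rule` and `accuracy` are hypothetical, chosen only to show the effect.

```python
# Hypothetical binary data set: 95 rows from a large group, 5 from a small one.
# Labels are invented so that the correct answer differs between the groups.
data = [("large", 1)] * 95 + [("small", 0)] * 5

def majority_rule(_group):
    # Ignores the group entirely and always predicts the large group's label.
    return 1

def accuracy(rows, predict):
    """Fraction of rows where the prediction matches the label."""
    return sum(predict(group) == label for group, label in rows) / len(rows)

overall = accuracy(data, majority_rule)
small_only = accuracy([row for row in data if row[0] == "small"], majority_rule)
print(overall)     # 0.95
print(small_only)  # 0.0
```

The rule looks 95% accurate overall while being wrong on every member of the small group, so an optimizer chasing overall accuracy has little reason to do better.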