| The article didn't specify how they labeled resumes for training. You're assuming that it was based on whether or not the candidate was hire. Nobody with an iota of experience in machine learning would do something like that. (For obvious reasons: you can't tell from your data whether people you did not hire were truly bad.) A far more reasonable way would be to take resumes of people who were hired and train the model based on their performance. For example, you could rate resumes of people who promptly quit or got fired as less attractive than resumes of people who stayed with the company for a long time. You could also factor in performance reviews. It is entirely possible that such model would search for people who aren't usually preferred. E.g. if your recruiters are biased against Ph.D.'s, but you have some Ph.D.'s and they're highly productive, the algorithm could pick this up and rate Ph.D. resumes higher. Now, you still wouldn't know anything about people whom you didn't hire. This means there is some possibility your employees are not representative of general population and your model would be biased because of that. Let's say your recruiters are biased against Ph.D.'s and so they undergo extra scrutiny. You only hire candidates with a doctoral degree if they are amazing. This means within your company a doctoral degree is a good predictor of success, but in the world at large it could be a bad criteria to use. |
Its an interesting questions. On one hand, a practical person could argue: "Well, this is what my company looks like, and these are the types of people who fit with our culture and make it, so be it. Find me these types of candidates."
VS
"I don't like the way may company culture looks, I would rather it was more diverse. This mono-culture is potentially leaving money on the table from not being diverse enough. I'm going to take my current employees, chart their career path, composite them (maybe), tweak some of the ugly race and gender stats for those who were promoted, and feed this to my hiring algorithm."