Hacker News
by gambler 2808 days ago
The article didn't specify how they labeled resumes for training. You're assuming that it was based on whether or not the candidate was hired. Nobody with an iota of experience in machine learning would do something like that. (For obvious reasons: you can't tell from your data whether people you did not hire were truly bad.)

A far more reasonable way would be to take resumes of people who were hired and train the model based on their performance. For example, you could rate resumes of people who promptly quit or got fired as less attractive than resumes of people who stayed with the company for a long time. You could also factor in performance reviews.
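As a rough sketch of what such labeling could look like (the field names, the 1-5 review scale, the 36-month cutoff and the 50/50 weighting are all assumptions made up for illustration, not anything the article describes):

```python
from dataclasses import dataclass


@dataclass
class HiredEmployee:
    tenure_months: int   # how long the person stayed with the company
    left_early: bool     # promptly quit or got fired (hypothetical flag)
    avg_review: float    # mean performance review, assumed 1-5 scale


def resume_label(e: HiredEmployee, long_tenure_months: int = 36) -> float:
    """Return a training target in [0, 1] for a hired employee's resume."""
    if e.left_early:
        # People who promptly quit or got fired get the least attractive label.
        return 0.0
    # Tenure saturates at the (assumed) "long time" threshold.
    tenure_part = min(e.tenure_months / long_tenure_months, 1.0)
    # Rescale the 1-5 review score to 0-1.
    review_part = (e.avg_review - 1.0) / 4.0
    # Equal weighting is an arbitrary choice for this sketch.
    return 0.5 * tenure_part + 0.5 * review_part
```

Note that this only ever produces labels for people who were hired, which is exactly the limitation discussed next.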

It is entirely possible that such a model would search for people who aren't usually preferred. E.g. if your recruiters are biased against Ph.D.'s, but you have some Ph.D.'s and they're highly productive, the algorithm could pick this up and rate Ph.D. resumes higher.

Now, you still wouldn't know anything about people whom you didn't hire. This means there is some possibility your employees are not representative of the general population and your model would be biased because of that.

Let's say your recruiters are biased against Ph.D.'s and so they undergo extra scrutiny. You only hire candidates with a doctoral degree if they are amazing. This means within your company a doctoral degree is a good predictor of success, but in the world at large it could be a bad criterion to use.
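A toy simulation of that scenario, to make the selection effect concrete. Everything here is an assumption for illustration: skill is drawn from the same distribution for both groups, recruiters see a noisy skill signal, and Ph.D.'s face a higher hiring bar (1.5 instead of 0.0).

```python
import random


def simulate(n=100_000, seed=0):
    rng = random.Random(seed)
    population = {True: [], False: []}  # success flags by has_phd, all applicants
    hired = {True: [], False: []}       # success flags by has_phd, hired only
    for _ in range(n):
        has_phd = rng.random() < 0.2
        skill = rng.gauss(0, 1)                 # same distribution for both groups
        success = skill + rng.gauss(0, 1) > 0   # noisy on-the-job outcome
        population[has_phd].append(success)
        signal = skill + rng.gauss(0, 0.5)      # what the recruiter perceives
        bar = 1.5 if has_phd else 0.0           # extra scrutiny for Ph.D.'s
        if signal > bar:
            hired[has_phd].append(success)

    def rate(flags):
        return sum(flags) / len(flags)

    return {
        "population_phd": rate(population[True]),
        "population_no_phd": rate(population[False]),
        "hired_phd": rate(hired[True]),
        "hired_no_phd": rate(hired[False]),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Among all applicants the degree tells you nothing about success (by construction), but among hires the Ph.D. group looks clearly better, simply because they had to clear a higher bar. A model trained only on hires would learn the second pattern.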

7 comments

I'm not an ML guy, but reading this, it almost sounds like the training data needs to be a fictional, idealized set, and not based on real world data that already has bias slants built in. Possibly composites of real world candidates with idealized characteristics and fictional career trajectories. Basically, what-my-company-looks-like vs what-I-want-it-to-look-like. I'm not sure this is even possible.

It's an interesting question. On one hand, a practical person could argue: "Well, this is what my company looks like, and these are the types of people who fit with our culture and make it, so be it. Find me these types of candidates."

VS

"I don't like the way my company culture looks, I would rather it was more diverse. This mono-culture is potentially leaving money on the table from not being diverse enough. I'm going to take my current employees, chart their career path, composite them (maybe), tweak some of the ugly race and gender stats for those who were promoted, and feed this to my hiring algorithm."

> the training data needs to be a fictional, idealized set, and not based on real world data that already has bias slants built in

That'd be great, but in this case (as in most ML cases) the idea is not "follow this known, tedious process" but instead "we have inputs and results but don't know the rules that connect them, can you figure out the rules?"

> this is what my company looks like

In tech hiring, no one wants the team they have...they want more people but without regrets (including regretting the cost)

> You're assuming that it was based on whether or not the candidate was hired. Nobody with an iota of experience in machine learning would do something like that. (For obvious reasons: you can't tell from your data whether people you did not hire were truly bad.)

It's a fine strategy if all you're trying to do is cost-cut and replace the people that currently make these decisions (without changing the decisions).

I agree that most people with ML experience would want to do better, and could think of ways to do so with the right data, but if all the data that's available is "resume + hire/no-hire", then this might be the best they could do (or at least the limit of their assignment).

A reasonable assumption but, in practice, false. Many companies believe (perhaps correctly) that their hiring system is good. Using hiring outcomes would be a reasonable dependent variable, especially if supply is lower than demand, performance is difficult to measure, or there’s a huge surplus of applications which need to be cut down to a smaller number of human-assessed resumes.
Men are promoted quicker, and more often, than women.
There was a company meeting one year at Amazon when they proudly announced that men and women were paid within 1-2% of each other for the same roles. It completely missed the point which you raise.

I want to see reports of average tenure and time between promotions by gender. I suspect that the reason we don't see those published is that the numbers are damning.

Or possibly no one did a study of sufficient size that passed peer review.

It's also not hard to make the pay gap 1-2% just like it's not hard to make it 25% (both values are valid). Statistics is a fun field. Don't trust statistics you didn't fake yourself.

Amazon could easily cook the numbers to get to 1-2%. I doubt anyone checked whether the process for determining that number is unbiased and fair and accounts for other factors.
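To show how both figures can be "valid" on the same payroll, here is a minimal sketch with invented toy numbers (not Amazon's data; the roles, salaries and headcounts are made up). Pay within each role differs by 1%, but women are concentrated in the junior role, so the overall gap comes out near 25%.

```python
from collections import defaultdict
from statistics import mean

# (gender, role, salary) rows; all values are hypothetical.
rows = (
    [("M", "junior", 100)] * 50
    + [("M", "senior", 200)] * 50
    + [("F", "junior", 99)] * 86
    + [("F", "senior", 198)] * 14
)


def raw_gap(rows):
    """Gap between average pay of all women and all men, ignoring role."""
    men = [pay for gender, _, pay in rows if gender == "M"]
    women = [pay for gender, _, pay in rows if gender == "F"]
    return 1 - mean(women) / mean(men)


def within_role_gap(rows):
    """Average of the per-role gaps, i.e. 'same pay for the same role'."""
    by_role = defaultdict(lambda: {"M": [], "F": []})
    for gender, role, pay in rows:
        by_role[role][gender].append(pay)
    gaps = [
        1 - mean(pays["F"]) / mean(pays["M"])
        for pays in by_role.values()
        if pays["M"] and pays["F"]
    ]
    return mean(gaps)


if __name__ == "__main__":
    print(f"raw gap:         {raw_gap(rows):.1%}")
    print(f"within-role gap: {within_role_gap(rows):.1%}")
```

Which number gets reported depends on what you control for, and the within-role figure hides any difference in who gets promoted into the senior role.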

I didn't write anything about promotions. I mentioned tenure and performance reviews.

If you had a way to accurately predict that some company would systematically downrate you and eventually fire you or force you to quit, would you want to interview there? If you were a recruiter in that company and could accurately predict the same, would it be ethical for you to hire the candidate anyway?

This is not to say that I approve of blindly trusting AI to filter candidates, but the overall issue isn't nearly as simple as many comments here make it out to be.

Does it correlate with performance?
And how is performance measured?

Aggressive behavior is considered admirable in men, and deplorable in women. Many women I know have noted comments in their performance reviews about their behavior - various words that can all be distilled to "bitchy".

And then you take your experience, connections and expertise to leave and start your own company where none of this happens.

But is that what we see in real life?

I don't have data or sources at hand, but I'd bet top dollar that the F-M ratio is much more lopsided in men's favor among founders[0] than among employees.

[0] Not using the word CEO, because that can be appointed for somewhat arbitrary reasons.

citation needed
downvoters, please explain. The statement makes sense when you look at it in tech where there are more men than women. So it may appear that more men are getting promoted compared to their women counterparts. But that doesn't mean men >>> women, it's just statistics at play.
> For obvious reasons: you can't tell from your data whether people you did not hire were truly bad.

Many companies are fine with false negatives in their hiring process. Better to pass on a good candidate than hire a bad one.

This also means that if you hire unqualified women only because they are women, then your AI will have bias against women.
This seems to assume that performance evaluation is itself free from bias.