
> And with machine learning we're building our biases into the algorithms explicitly; see all those "resume parsers are racist" articles.

It's even worse than that, because it's so hard to tell if this is even happening.

Suppose guidance counselors in predominantly black schools tell kids to focus on athletics, while those in predominantly white schools tell them to focus on intellectual extracurriculars. A resume parser then sorts people who list athletics into physical jobs and people who list intellectual pursuits into intellectual jobs, which of course results in the black applicants getting callbacks for the lower-paying jobs.

This is pretty clearly the guidance counselors causing the disparity rather than the algorithm, but we only know that because it was stipulated in the hypothetical. In real life you may not have enough data to discern the underlying cause. In other words, you don't know what baseline racial disparity to expect from all of the non-racial factors that correlate with race, so you can't tell whether the problem is in the algorithm or was caused somewhere upstream, with the algorithm only producing accurate outputs for its inputs.
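To make that concrete, here's a rough sketch with made-up numbers (the 0.7/0.3 steering rates, the 0.4 misrouting rate, and the `physical_rate` helper are all assumptions for illustration, not data from anywhere). It sets up two worlds, one where the counselors cause the disparity and one where the parser does, and works out the share of each group sorted into physical jobs.

```python
# Hypothetical numbers throughout; nothing here is measured data.

def physical_rate(p_athletics, parser):
    """P(sorted into a physical job) for one group, via the law of total probability.

    p_athletics: share of the group that lists athletics on the resume.
    parser: P(physical job | activity) for each activity.
    """
    return (p_athletics * parser["athletics"]
            + (1 - p_athletics) * parser["intellectual"])

# World A: counselors steer the groups differently; the parser ignores race.
blind_parser = {"athletics": 1.0, "intellectual": 0.0}
world_a = {
    "black": physical_rate(0.7, blind_parser),
    "white": physical_rate(0.3, blind_parser),
}

# World B: counselors steer everyone the same way; the parser treats groups differently.
world_b = {
    "black": physical_rate(0.5, {"athletics": 1.0, "intellectual": 0.4}),
    "white": physical_rate(0.5, {"athletics": 0.6, "intellectual": 0.0}),
}

for name, world in (("A", world_a), ("B", world_b)):
    print(name, {group: round(rate, 3) for group, rate in world.items()})
```

Both worlds come out at the same per-group rates, so if all you can observe is race and which job someone got a callback for, you can't tell them apart. You'd need the upstream variable (here, the listed activity), and in real life you'd need all of the upstream variables.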

In theory you can evaluate this by comparing how the hired candidates actually perform against how the algorithm predicted they would, but that's a noisy signal (what does "doing well" mean?), you might not have a large enough sample to get meaningful results, and there's a lag of potentially several years, by which point you may already be using a different algorithm. It's a hard problem.
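A minimal sketch of the sample size part, assuming "doing well" has somehow been reduced to a yes/no outcome per hire and the algorithm emits a success probability (both big assumptions; `calibration_gap` and the numbers are hypothetical). It compares the observed success rate to the predicted one and attaches a rough 95% margin from the normal approximation.

```python
import math

def calibration_gap(predicted, outcomes):
    """Observed success rate minus mean predicted probability, plus a rough 95% margin.

    predicted: the algorithm's success probability for each hire.
    outcomes: 1 if the hire "did well", 0 otherwise (however that gets defined).
    """
    n = len(outcomes)
    observed = sum(outcomes) / n
    expected = sum(predicted) / n
    # Normal-approximation standard error of the observed rate.
    margin = 1.96 * math.sqrt(observed * (1 - observed) / n)
    return observed - expected, margin

# Hypothetical group: the algorithm predicted 60% success, 50% actually did well.
small_gap, small_margin = calibration_gap([0.6] * 50, [1] * 25 + [0] * 25)
large_gap, large_margin = calibration_gap([0.6] * 5000, [1] * 2500 + [0] * 2500)

print(f"n=50:   gap={small_gap:+.3f}, margin={small_margin:.3f}")
print(f"n=5000: gap={large_gap:+.3f}, margin={large_margin:.3f}")
```

With 50 hires, the ten-point gap sits inside the margin of error, so you can't say the algorithm was off for that group. With 5000 it stands out clearly, but few employers hire 5000 people from one group through one version of one algorithm.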