by AnthonyMouse 2438 days ago
The elephant in the room is that the real way to tell whether that is the case would be to use race as a factor the same as age or sex. If African Americans are more careful drivers then that would detect it and take it into account.

But then you have to take the bad with the good. If it turns out that strict adherence to traffic laws that nobody else abides by is actually more dangerous than following the normal flow of traffic, it would also detect that and take it into account.

Well, it may be the case that they accidentally have a proxy for race already in their data (the "this ethnicity prefers red cars" hypothetical in the article above). So race may already be factored in, in practice, despite nobody intending that (assuming nobody anticipated that a particular metric is a racial proxy). That does not necessarily mean it's being unfair to that race, though. It could, hypothetically, mean that it's actually being fair to that race, advantaging them in a system/society that would otherwise disadvantage them.
The whole "proxy for race" thing is such a mess.

The original problem was that racists were not just taking race into account but were disproportionately penalizing certain races. They would literally just refuse to do business with black people. And then once that was made illegal, they would refuse to do business with people from black neighborhoods (redlining), i.e. use location as a proxy for race so they could continue to refuse to do business with black people under that pretext.

Normal Bayesian statistics doesn't do that because it's missing the actually racist piece of it, which is giving disproportionate weight to race (or something that correlates with race) so that you refuse disproportionately many people of a particular race for no legitimate reason.

The unfairness never came from taking into account some factor that correlates with race, or even race itself to the extent that it actually correlates with outcomes. It came from using a factor to deny service even though it didn't correlate with outcomes or, if it did, not in proportion to the huge negative weight assigned to it. It came from giving race, or a proxy for race, disproportionate weight. Giving it proportionate weight isn't unfair; it's the only thing that is fair.

I fully agree and admire the skill with which you've stated this.
"then that would detect it and take it into account."

You have a method for automatically deriving causal relationships from correlational data?

A lack of a causal relationship wouldn't matter in that case. If something correlates with the outcome then it allows you to better predict the outcome even if it isn't the cause, because it at least correlates with the cause or it wouldn't correlate with the outcome.

Though obviously if it isn't the cause then you're better off taking into account the true cause rather than only the thing that correlates with it -- once you take both into account, the non-causal factor stops adding any predictive value, so its apparent effect on the outcome disappears.
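
To make that concrete, here is a rough simulation sketch, not anyone's actual pricing model. Everything in it is made up for illustration: `cause`, `proxy`, `outcome`, the helper `fit_ols`, and the coefficient 2.0 are all assumptions. The proxy has no effect of its own, yet it predicts the outcome when used alone, and its weight drops to roughly zero once the true cause is in the model too.

```python
import numpy as np


def fit_ols(features, outcome):
    """Least-squares coefficients, intercept first (hypothetical helper)."""
    design = np.column_stack([np.ones(len(outcome)), features])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef


rng = np.random.default_rng(0)
n = 50_000

# Assumed synthetic data: only `cause` drives the outcome.
cause = rng.normal(size=n)
proxy = cause + rng.normal(size=n)            # correlated with cause, no effect itself
outcome = 2.0 * cause + rng.normal(size=n)

# Model 1: the proxy is the only predictor available.
proxy_only = fit_ols(proxy, outcome)

# Model 2: both the proxy and the true cause are available.
both = fit_ols(np.column_stack([proxy, cause]), outcome)

print("proxy weight, used alone:   ", round(proxy_only[1], 3))  # near 1.0 in this setup
print("proxy weight, with cause:   ", round(both[1], 3))        # near 0.0
print("cause weight, with proxy:   ", round(both[2], 3))        # near 2.0
```

The first model still predicts better than ignoring the proxy entirely, which is the point about correlation being enough for prediction. The second shows the weight moving onto the true cause when it is available. Real data is messier: the true cause is often unmeasured, in which case the proxy keeps its weight.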