by yummyfajitas 3568 days ago
If I understand your model right, you are saying that Idahoans don't repay loans and your model accurately reflects this. This isn't a bias at all. The model is issuing fewer loans to green people not because they are green but because they live in Idaho and are unlikely to pay back said loans.

This is the kind of case the article describes: one where even a perfect predictor (another word for this is "reality" or "hindsight") will still exhibit disparate impact.


It is a bias if you calculate the cost to people taking out loans, based on color. Green people will pay a higher cost, even though in the ground-truth model their race is not directly related to loan repayment.

For example, if only blue people in Idaho fail to repay loans, green people will still absorb a greater cost in the multiple regression case above (in the sense that they are more likely to be penalized for being Idahoans).

Yes, if it's actually (blue & Idaho) ~> default, and your model ignores blue, then the greens will pay a higher cost. If color is redundantly encoded, then your model can partially fix this and penalize the blues in Idaho.
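To put rough numbers on that, here is a minimal sketch. The group sizes and default rates are made up for illustration (nothing here comes from the article), and the "model" is just the per-state average default rate, standing in for any model that sees state but not color.

```python
# Hypothetical population: (color, state) -> (number of borrowers, true default rate).
# All figures are invented for illustration only.
population = {
    ("blue", "Idaho"): (50, 0.40),      # only this group defaults at a high rate
    ("green", "Idaho"): (50, 0.05),
    ("blue", "Elsewhere"): (50, 0.05),
    ("green", "Elsewhere"): (50, 0.05),
}


def state_only_rate(state):
    """Default rate predicted by a model that sees state but not color."""
    groups = [(n, p) for (_, s), (n, p) in population.items() if s == state]
    total = sum(n for n, _ in groups)
    return sum(n * p for n, p in groups) / total


def overcharge(color, state):
    """Predicted minus true default rate for one group under the state-only model."""
    _, true_rate = population[(color, state)]
    return state_only_rate(state) - true_rate


for key in population:
    print(key, round(overcharge(*key), 3))
```

Under these assumed numbers the state-only model prices every Idahoan at the blended Idaho rate, so green Idahoans are charged for risk that belongs to blue Idahoans, while nobody outside Idaho is affected.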

Do you consider this situation unjust? If so, you might be unhappy to learn that the entire goal of the field of algorithmic fairness is to do something along these lines.

> Available data on blacks specifically is completely irrelevant if blacks and whites aren't fundamentally different. The white model will generalize.

I should have been clearer that I was responding to this part of your comment. Even if blacks and whites aren't fundamentally different (in the sense that your race does not directly cause an outcome of interest), you can produce biases that are essentially a misattribution of the relationship between race and that outcome. Worse, if there _is_ a relationship, you can reverse the direction a model estimates for the relationship (Simpson's paradox).
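A small sketch of the Simpson's paradox point, reusing the green/blue and Idaho framing from upthread. The counts are hypothetical and chosen only so the reversal shows up; `rate` is an illustrative helper, not anything from a real library.

```python
# Hypothetical counts: (color, state) -> (borrowers, defaults).
# Invented so that greens are concentrated in the high-default state.
counts = {
    ("green", "Idaho"): (80, 24),
    ("blue", "Idaho"): (20, 7),
    ("green", "Elsewhere"): (20, 1),
    ("blue", "Elsewhere"): (80, 8),
}


def rate(color, state=None):
    """Default rate for a color, within one state or pooled over all states."""
    rows = [
        v for (c, s), v in counts.items()
        if c == color and (state is None or s == state)
    ]
    borrowers = sum(n for n, _ in rows)
    defaults = sum(d for _, d in rows)
    return defaults / borrowers


for state in ("Idaho", "Elsewhere", None):
    label = state or "pooled"
    print(label, "green:", rate("green", state), "blue:", rate("blue", state))
```

With these assumed counts, greens default less often than blues inside each state, yet the pooled comparison that ignores state makes greens look like the worse risk, so the estimated direction flips depending on whether state is in the model.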

> Do you consider this situation unjust? If so, you might be unhappy to learn that the entire goal of the field of algorithmic fairness is to do something along these lines.

I don't think the creation of tools to accommodate this specific purpose is bad, per se. Whether or not they are the appropriate tool to use is a different question.