| Even in models where race doesn't directly cause an outcome, a model's judgements may be biased against a race. For example, suppose that (1) people can be green or blue, (2) green people tend to live in Idaho, and (3) living in Idaho is associated with not paying back loans. A linear model in which the only non-zero, positive coefficients are on the paths p(green) -> p(Idaho) -> p(fail_to_repay) and p(credit_score) -> p(fail_to_repay) will create trouble, even though color does not directly affect repayment. If you fit a multiple regression fail_to_repay ~ B0 + B1*Idaho + B2*credit_score, it will discriminate against green people by penalizing people from Idaho. AFAIK, one of the points of the paper linked in the parent comment is that blindly using indicators like IP address may indirectly lead to discrimination against a racial group in this way, e.g. p(racial_group) -> p(a_specific_IP_address). Maybe more relevant to your example, though: assuming whites and blacks share the same model in the "ground-truth" scenario I presented could cause a model to be discriminatory when it shouldn't be, because the coefficient for the path from p(green) -> p(fail_to_repay) is 0. This specific issue is hairy, and it exists for traditional approaches too. |
This is a case like the one described in the article: even a perfect predictor (another word for this is "reality" or "hindsight") will still exhibit disparate impact.
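To make the green/blue example concrete, here is a rough simulation sketch. Everything numeric in it is an assumption made up for illustration (the 80%/20% Idaho split, the effect sizes, the sign of the credit-score effect, the sample size); only the structure comes from the comments above: color influences where people live, location and credit score influence repayment, and color has no direct effect. It fits the regression fail_to_repay ~ B0 + B1*Idaho + B2*credit_score with ordinary least squares and compares the two color groups.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical data-generating process; all numbers are illustrative assumptions.
green = rng.random(n) < 0.5                          # True = green, False = blue
idaho = rng.random(n) < np.where(green, 0.8, 0.2)    # green people tend to live in Idaho
credit_score = rng.normal(0.0, 1.0, n)               # standardized, independent of color

# Color never appears here: only Idaho and credit_score affect repayment.
p_fail = np.clip(0.2 + 0.3 * idaho - 0.05 * credit_score, 0.0, 1.0)
fail_to_repay = rng.random(n) < p_fail

# Fit fail_to_repay ~ B0 + B1*Idaho + B2*credit_score by ordinary least squares.
X = np.column_stack([np.ones(n), idaho, credit_score]).astype(float)
coef, *_ = np.linalg.lstsq(X, fail_to_repay.astype(float), rcond=None)
predicted = X @ coef

# Gap between groups in the model's predicted risk (the model never saw color).
gap_predicted = predicted[green].mean() - predicted[~green].mean()
# Gap between groups in the actual outcomes, i.e. what a perfect predictor would report.
gap_actual = fail_to_repay[green].mean() - fail_to_repay[~green].mean()

print(f"B1 (Idaho coefficient): {coef[1]:.3f}")
print(f"predicted-risk gap, green minus blue: {gap_predicted:.3f}")
print(f"actual-outcome gap, green minus blue: {gap_actual:.3f}")
```

Under these assumptions, the model assigns green people a noticeably higher average risk even though color is not an input, and the gap in actual outcomes shows the same pattern, which is the "perfect predictor" point: the disparity is already present in the data, not introduced by the fitting procedure.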