| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by ThrustVectoring 2379 days ago

The real problem is that there are mutually exclusive desiderata from your models.

1. If you have two people with identical relevant behavior and different races, you want the model to score them identically.

2. Each race should receive a comparable distribution of scores.

3. The scores should be as accurate of a predictor of ground facts as possible.

Relax the first desiderata and your model is now either explicitly or implicitly (via irrelevant proxy variables) using race to determine results, opening you up to racial discrimination lawsuits. Relax the second desiderata and your model is now creating disparate impact across racial groups, opening you up to racial discrimination lawsuit. Relax the third and you're leaving accuracy, and thus money, on the table.