|
|
|
|
|
by codefreeordie
1496 days ago
|
|
It sometimes makes sense to scrub race/ethnicity/gender information from certain types of data, typically when a human is going to be making individual decisions. For example, not having race data on resumes is generally productive, because that categorization can't provide a meaningful input to the decision associated with an individual person. Even if it were to be the case that there was some correlation between race and skill at whatever job you're interviewing for[1], the size of the effect is almost certainly small, and in the meanwhile you've also controlled for any bias in the person doing the reviewing. If you're having a machine look at a dataset, and the machine determines that race or ethnicity is a material factor in determining some attribute in that dataset, you're not doing anybody any good by denying that fact and destroying the result. [1]Let's ignore for the purposes of this discussion, fields (like certain sports) where extreme competition combines with a position heavily dependent upon racially-linked physical characteristics. Though even in this case, there is still a (different, weaker) argument for suppressing race data in "resumes" (yes, I know, ballplayers don't submit resumes to their local NBA franchise) |
|
If the outcome that you're trying to predict is also affected by perceptions of race, you've built a gossip feedback loop.