by danso 3568 days ago
OK, I guess I'm supposed to agree with you if I beg the question that "available data on blacks specifically is completely irrelevant"...? I do think that the distribution of data specific to blacks is relevant.

Ok, so now we have all acknowledged that we are "race realists" or "scientific racists" in this conversation. ( https://en.wikipedia.org/wiki/Scientific_racism )

Anyway, we've now accepted blacks and whites may behave differently. For example, let's suppose we have all the training data we need to accurately recognize that one race doesn't pay back their loans as much as others, all else held equal.

What should we do about it? Concretely, how many bad loans should we issue in the name of "fairness"? How large a subsidy must the responsible races pay to the deadbeat ones?

I don't know that either I or Dr. King Jr. has to subscribe to scientific racism just because we subscribe to the reality that folks of different racial backgrounds have a higher probability of having been shortchanged historically. And thus, that any machine learning approach that doesn't factor this in will risk perpetuating such disadvantages, which kind of defeats the ostensible purpose of using machine learning to apply public policy in the first place.

Historically isn't the issue. The issue is a simple factual question of whether, all else held equal, black people repay their loans at the same rate as whites in identical financial circumstances. The fact that in aggregate financial circumstances might be different isn't important to this question.

If they do, then you don't need to worry about algorithms discriminating. Insofar as the algorithms do discriminate, it's merely a sampling error (i.e., it shrinks like O(1/sqrt(N)), where N = Nwhite + Nblack), and they are just as likely to discriminate in favor as against.
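To make the O(1/sqrt(N)) scaling concrete, here is a minimal sketch. It assumes both groups share the same true repayment rate, treats each loan as an independent coin flip, and grows both groups in a fixed proportion. The function name and every number are hypothetical, chosen only for illustration.

```python
import math

def diff_std_error(p, n_a, n_b):
    """Standard error of the difference between two sample repayment rates
    when both groups share the same true rate p (binomial sampling)."""
    return math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))

# Hypothetical numbers: true repayment rate 0.9, group sizes in a fixed 20:1 ratio.
p = 0.9
small = diff_std_error(p, n_a=2000, n_b=100)   # N = 2100
large = diff_std_error(p, n_a=8000, n_b=400)   # N = 8400, four times larger

# Quadrupling N (at a fixed group ratio) should halve the spurious gap.
ratio = small / large
print(f"N=2100: {small:.4f}  N=8400: {large:.4f}  ratio: {ratio:.2f}")
```

Under these assumptions the expected gap is zero, so any measured difference is noise that is equally likely to point in either direction.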

If they don't, then you subscribe to scientific racism, or the belief that blacks and whites in identical circumstances behave fundamentally differently.

(I describe these different cases in explicit detail here: https://www.chrisstucchio.com/blog/2016/alien_intelligences_... )

So do you believe race affects reality independent of other factors? And assuming you do subscribe to scientific racism, what should we do about it?

> The issue is a simple factual question of whether, all else held equal, black people repay their loans at the same rate as whites in identical financial circumstances

Oh if you put it that way, then I don't know. Because that's not the reality that's being dealt with, in which whites and blacks have identical circumstances. I think you're reading something into this that others aren't.

> Because that's not the reality that's being dealt with, in which whites and blacks have identical circumstances.

Of course it is. There may be 5 blacks and 100 whites with a credit score of 830. But as long as blacks and whites with an 830 credit score behave the same, then data from whites will generalize to blacks and the problem tlb brought up doesn't apply. Redundant encoding is also irrelevant - this is useless information so an accuracy maximizer has no reason to pay any attention.

Insofar as blacks and whites with an 830 credit score behave differently, then algorithms might treat them differently. That's the "race realism" hypothesis.

Having the same credit score, or other data, does not mean they have identical circumstances. For example, your willingness to follow the rules of society, even to your detriment, might be a result of whether society's rules have treated you fairly in the past.

So a belief that you'll get different default rates given some financial data does not imply that you are a scientific racist. To create identical circumstances you'd have to swap the brains (and some other relevant internal organs) of some black and white infants. A scientific racist view is that then the likelihood of paying off the loans would follow the brain, not the skin.

FWIW, I present a case where race does not directly cause increased failure to repay, but common approaches to modeling could discriminate against race.

These issues have been discussed in detail in statistical treatments of Simpson's paradox. One need not accept that racial differences directly affect an outcome of interest in order to be concerned about a model being biased against a race!
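A small worked example of Simpson's paradox may help here. The counts below are invented purely for illustration, and the group and stratum labels are hypothetical. Within each income stratum, group_b repays at a slightly higher rate than group_a, yet the aggregate rates point the other way because the groups are distributed differently across strata.

```python
# Hypothetical counts, invented for illustration: (loans issued, loans repaid)
loans = {
    "high_income": {"group_a": (900, 855), "group_b": (100, 96)},
    "low_income":  {"group_a": (100, 70),  "group_b": (900, 648)},
}

def rate(issued, repaid):
    """Fraction of issued loans that were repaid."""
    return repaid / issued

def stratum_rates(data):
    """Repayment rate per group within each stratum."""
    return {s: {g: rate(*c) for g, c in groups.items()}
            for s, groups in data.items()}

def aggregate_rates(data):
    """Repayment rate per group after pooling all strata together."""
    totals = {}
    for groups in data.values():
        for g, (issued, repaid) in groups.items():
            i, r = totals.get(g, (0, 0))
            totals[g] = (i + issued, r + repaid)
    return {g: rate(*c) for g, c in totals.items()}

within = stratum_rates(loans)
overall = aggregate_rates(loans)
print(within)
print(overall)
```

In this toy data, a model that sees group membership but not the stratum would rank group_b as the worse risk, even though group_b is the better risk in every stratum.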