by d3ntb3ev1l 1950 days ago
I agree with you but I am still scared of racism.

> I agree with you but I am still scared of racism.

My suspicion is that the concern with machine learning over racism is rooted in two things. The first is just the general modern trend of accusing anything you don't like of being racist, because everybody hates racism and wants to fight it. And the second is the fear on the part of people who make a living fighting racism that machine learning might actually put them out of a job.

Because machine learning is basically a paperclip optimizer. You tell it to maximize a thing, it maximizes the thing and minimizes everything else. Racism isn't paperclips, so the paperclip optimizer will optimize for smashing it in favor of making more paperclips. And then they're out of business.

Because when you look at the criticism of this stuff, it generally looks like this. ~12% of the population is black, only ~5% of the selected applicants are black, the algorithm is accused of racism.

But nothing is that simple, because all kinds of things like income and education level and so on correlate with race, so you have to take all of those things into account before you can tell what's going on. And taking into account all of the available data is how machine learning works.
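
To make the point about correlated factors concrete, here is a minimal sketch comparing raw selection rates with rates inside a single stratum. Everything in it is an assumption for illustration: the groups "A" and "B", the `has_degree` flag, and all counts are made up, not real data. It only shows that a raw gap by itself does not settle the question; the same check can just as easily show a gap that persists inside a stratum.

```python
from collections import defaultdict

# Hypothetical records: (group, has_degree, selected). All counts are invented.
def make_records():
    records = []
    # Group A: 100 applicants, 50 with a degree, 25 of those selected.
    records += [("A", True, True)] * 25
    records += [("A", True, False)] * 25
    records += [("A", False, False)] * 50
    # Group B: 100 applicants, 20 with a degree, 10 of those selected.
    records += [("B", True, True)] * 10
    records += [("B", True, False)] * 10
    records += [("B", False, False)] * 80
    return records

def selection_rates(records, key):
    """Fraction of selected records for each value returned by `key`."""
    counts = defaultdict(lambda: [0, 0])  # key -> [selected, total]
    for rec in records:
        k = key(rec)
        counts[k][0] += int(rec[2])
        counts[k][1] += 1
    return {k: sel / total for k, (sel, total) in counts.items()}

records = make_records()

# Raw rate per group, ignoring every other column.
raw = selection_rates(records, key=lambda r: r[0])

# Rate per (group, has_degree) stratum.
stratified = selection_rates(records, key=lambda r: (r[0], r[1]))

print(raw)
print(stratified)
```

In this invented data the raw rates differ between the groups while the rates among degree holders are identical. That does not explain why the degree rates differ, and real data may look nothing like this.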

Which isn't to say that you couldn't make an algorithm racist. Tell it to optimize for applicants with a particular skin color and it does. But then your problem isn't with the algorithm, it's with the jackasses who asked for that.

What to optimize for is a much more general and difficult question. (Hint: Not paperclips.)

> My suspicion is that the concern with machine learning over racism is rooted in two things. The first is just the general modern trend of accusing anything you don't like of being racist, because everybody hates racism and wants to fight it. And the second is the fear on the part of people who make a living fighting racism that machine learning might actually put them out of a job.

I don't get how you go from this statement to then explaining exactly how racism gets embedded in algorithms: by using the biased data we have in the real world...

It isn't the data that's biased. If you're hiring a computer scientist and disproportionately few black people have a degree in computer science, the data is not lying about who the qualified applicants are and the algorithm can't change that.

To fix that you have to get more black high school students to go to college and study computer science, and then wait two generations until their share of the installed base of qualified computer scientists reaches parity. There is no magic wand that makes it happen overnight.

But concentrating on the places where it can't be solved instead of the places where it can will make it take even longer.

No, the racism is a real issue, though a lot of it is caused by limited training data. Having an image recognition algorithm identify Africans and South Asians as gorillas doesn't happen because the designers intended it, but because their training data had only light-skinned human faces and dark-skinned primates. But the effect is racist even though this wasn't the intent.

Likewise, if the system is trained to duplicate human decision-making (like who gets loans), interesting things can happen: if the decision-makers unconsciously favored whites over blacks, the algorithm could wind up weighing skin color or stereotypically Black or Latino names negatively, meaning that the final model is explicitly racist, just because there is a correlation in the training data. That doesn't mean we shouldn't use deep learning, it means that it's not responsible to just fit the training data and ship without testing for such problems.
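
One possible shape for "testing for such problems" is a flip test: hold an applicant fixed, vary only the sensitive attribute, and see whether the score moves. This is a rough sketch, not an established audit procedure. The names `flip_test`, `biased_score`, `neutral_score`, and the `name_group` and `income` fields are all hypothetical, and the two scoring functions stand in for a trained model.

```python
def flip_test(score, applicants, attribute, values, tol=1e-9):
    """Return applicants whose score changes when only `attribute` is varied."""
    flagged = []
    for applicant in applicants:
        # Score copies of the same applicant that differ only in `attribute`.
        scores = {v: score({**applicant, attribute: v}) for v in values}
        if max(scores.values()) - min(scores.values()) > tol:
            flagged.append((applicant, scores))
    return flagged

# Hypothetical stand-in for a model that learned to penalize one name group.
def biased_score(applicant):
    penalty = 0.5 if applicant["name_group"] == "B" else 0.0
    return 0.1 * applicant["income"] - penalty

# Hypothetical stand-in for a model that ignores the attribute entirely.
def neutral_score(applicant):
    return 0.1 * applicant["income"]

applicants = [
    {"name_group": "A", "income": 40},
    {"name_group": "B", "income": 40},
    {"name_group": "A", "income": 70},
]

biased_flags = flip_test(biased_score, applicants, "name_group", ["A", "B"])
neutral_flags = flip_test(neutral_score, applicants, "name_group", ["A", "B"])

print(len(biased_flags), len(neutral_flags))
```

A check like this only catches direct use of the attribute. It would miss a model that relies on a correlated proxy such as a zip code, so it is a starting point rather than a complete test.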

> Having an image recognition algorithm identify Africans and South Asians as gorillas doesn't happen because the designers intended it, but because their training data had only light-skinned human faces and dark-skinned primates. But the effect is racist even though this wasn't the intent.

This isn't racism at all. It's just bad PR because humans take the implication that calling black people monkeys is calling them stupid, since that's the implication you would draw if a person did that.

An algorithm doing that is just recognizing that humans and gorillas are both primates:

http://www.aquilaarts.com/bushmonkey.html

And then it's a bug, in the same way that recognizing a black balloon as a balloon but a white balloon as a light bulb is a bug. It has nothing to do with race at all. The algorithm isn't racist against white balloons. The solution is a general increase in the amount of training data, which is what you want in all cases regardless.

> if the decision-makers unconsciously favored whites over blacks, the algorithm could wind up weighing skin color or stereotypically Black or Latino names negatively, meaning that the final model is explicitly racist, just because there is a correlation in the training data.

Except that this is exactly the thing that a paperclip optimizer will smash to bits because it interferes with the goal of making more paperclips.

I’m not an expert in this, but I think racists call black people apes, not just because they think they are stupid, but because they think they are sub-human.

Blacks don’t reach the intelligence and blah to be human. I think that’s what racists drive at when they call someone a monkey, and that’s why it’s so offensive.

It would also make your theoretical AI racist, as it identified blacks as not human.

Honestly, at the end of the day that is what is so difficult about much of this. It’s mostly subjective.

> It would also make your theoretical AI racist, as it identified blacks as not human.

That isn't how racism works. It's like saying that an AI that misclassifies a bat as a bird is racist. It's not racism, it's just error.

And it's not a race-specific error, it's a general error for which someone cherry picked the instances that imply a racially motivated intent that doesn't actually exist.

Calling it racism is pointless and misleading because there is no race-specific cause or solution to the problem. The solution is completely identical to the one for the same error in the general case, i.e. get more training data.

This isn't theoretical: Google Photos was identifying Black people as gorillas, and they didn't fix it, they worked around the bug by removing "gorilla" as a possible label in 2018. Some here seem to be saying that we can't call this racist unless someone specifically intended to do this because of hate. It's not subjective when someone's own face is so flagged.

https://www.theverge.com/2018/1/12/16882408/google-racist-go...