by jjevanoorschot 1121 days ago
I'm on board with assuming that the intent of Google / the engineers who built the classifier is not racist. However the outcome — labelling a black person as a gorilla — certainly is racist. What makes you think otherwise?
4 comments

The reason I don’t think the actual act of the program bugging is racist is I imagine it’s just some stupid rule based on shape (primates have a similar shape) and skin color and isn’t smart enough to distinguish humans vs gorillas vs Gumby toys with black/grey skin closer to a gorilla.

I think intent is important for labeling something racist and a function doesn’t have intent. And it doesn’t seem like the programmer had intent.

So I agree that a human seeing a person and labeling them as a gorilla is racist; that’s because the human is making an inappropriate value judgement.

> The reason I don’t think the actual act of the program bugging is racist is I imagine it’s just some stupid rule based on shape (primates have a similar shape) and skin color and isn’t smart enough to distinguish humans vs gorillas vs Gumby toys with black/grey skin closer to a gorilla.

But this is AI -- so the stupid rule is not pre-programmed, but rather curve-fit to the data (uh, "learned").

So ultimately it's a matter of (1) failing to find the right training data (or procedure) and (2) more fundamentally, choosing not to correct the problem after 8 years.

I was generalizing the bug but my basis is an assumption that the programmer didn’t make some conscious or unconscious racist decision, but just something basic like “match shapes and colors” and the training data had a bunch of gorillas for one reason or another.

I think this gets fixed by better training data and more pictures of really dark-skinned people, with more supervised labels mapping dark-skinned people to “people”, so that the matching doesn’t think people are closer to gorillas.

Comically/sadly, we’ll know we’re getting closer to fixing the training sets to be more inclusive when Google starts labeling gorillas as people.

I think there are some systemic reasons why there aren’t more diverse populations in training data. And those are more societal issues than AI issues (i.e., rich people are more represented, rich people are disproportionately of certain races, therefore those races are more represented).

And finally, I’ve worked in software where people just test what they are and what they know, so I’ve seen many test plans that are too simple and only test the programmer’s DOB and address. This doesn’t mean it’s racist because all the programmers are Asian males. It just means the quality review wasn’t thorough enough to include proper test conditions.

I might be inappropriately conflating software bugs from different areas but this is what makes me think “stupidity or weakness more likely than racism.”

How is that racist? Because you’re projecting a racist comment onto it, a classifier? That does not make sense to me.
The racism comes from the fact non-white people were not properly considered when the model was developed and trained. This comes up time and time again in AI, ranging from face ID that only works on white people, to porn classifiers that associate black people with NSFW images.
No matter how good or well trained on good data with good representation of all skin colors a classifier is, it’s going to misclassify people and things periodically, and it’s definitely going to misclassify black people as gorillas more often than other races.
But white people get misclassified as animals by the classifier too. Typically white people aren't misclassified as gorillas but as other animals. So I don't think the cause is as simple as non-white people not being considered during training.
It classified 80 photos of the same black person as 'gorilla', I have not heard of that happening with white people.
I saw lots of examples of white children being classified as seals
Have a link to an example or two? I can't find any after a few minutes of searching.
> The racism comes from the fact non-white people were not properly considered

Is this the case? Do we even know for a fact that only non-white people were mislabeled as anything else?

Or are we just, you know, throwing out baseless speculation as fact?

If a system consistently misclassifies black persons far more than white persons -- and does so in a way that's obviously provocative and offensive -- then by definition it's racist in its effect (regardless of intent). The fact that the smartest company in the world cannot seem to get a handle on this problem after 8 years is also not unreasonable grounds to suspect that something's up.

Like that they don't appreciate the gravity of the problem, for example.
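
One way to make "misclassifies one group far more than another" concrete is to compare error rates per group. The following is only a minimal sketch: the group names, labels, and records are made-up toy data, and `error_rate_by_group` is a hypothetical helper, not anything Google is known to use.

```python
from collections import defaultdict

def error_rate_by_group(records):
    """Return the misclassification rate for each group.

    records: iterable of (group, true_label, predicted_label) tuples.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, truth, pred in records:
        totals[group] += 1
        if pred != truth:
            errors[group] += 1
    return {g: errors[g] / totals[g] for g in totals}

# Toy data (invented for illustration): four photos per group,
# every photo truly shows a person.
records = [
    ("group_a", "person", "person"),
    ("group_a", "person", "person"),
    ("group_a", "person", "person"),
    ("group_a", "person", "animal"),
    ("group_b", "person", "animal"),
    ("group_b", "person", "animal"),
    ("group_b", "person", "person"),
    ("group_b", "person", "person"),
]

rates = error_rate_by_group(records)
print(rates)
```

A large gap between the per-group rates is the kind of measurable disparity in effect being described here, and it says nothing either way about intent.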

These are dangerous grounds to discuss, but I don't think it's racist (colloquially) at all. If gorillas were like yetis and covered in white fur and it started labeling anglos as gorillas, it's not racist either. Racism (colloquially) comes from bad people's intentions. Who would've thought that a classifier would accidentally mislabel people as a creature that is very similar to us humans and has a color that matches some humans?

What would be racist from this outcome is if it kept doing this and no one did anything. Clearly it hurts people's feelings and that is a very valid issue. Google's option to just nuke it is a great start until they can hammer out the kinks.

> Racism (colloquially) comes from bad people's intentions.

Racism can also be measured by its effect, regardless of intent.

> What would be racist from this outcome is if it kept doing this and no one did anything.

After 8 years, that seems to be precisely what's happening.

Isn't the point of the article that it just refuses to recognize gorillas outright? That prevents exactly what you're talking about. And I made that point in my post. It is hurtful, so Google preventing Google Photos from classifying anything as a gorilla is a good band-aid. Some things are just too risky to solve for little gain.
Racism isn't an objective order existing in the universe separate from us. It's part of human experience and exists where humans experience it.

Given the recent history of equating black people with non-human primates, and using that to deny them rights & full participation in society, making this error is going to be experienced as racist. It's not a matter of individual malice or taxonomic classification, but of history and social relations.

I think we can all agree that the classifier is horribly broken.

But it seems like if nobody is working on this, how will we ever fix this gaping hole in image classifiers? And don't we want to fix it? And to fix it, researchers will continue to get it wrong until they get it less wrong and more right, but they can only iterate if there isn't a massive backlash. It seems like being stuck between a rock and a hard place.

I am rhetorically asking, wouldn't we have to allow researchers to iterate on this problem to fix it? That simply won't happen until we are able to allow them leeway understanding that this is an incrementally improving model. Otherwise what we have is just a sledgehammer solution (just banning all primate classifications) which actually never addressed the problem, that these models do have a race-based bias (probably in their input datasets.)

I'm simply answering the question of how it is racist, not currently trying to tackle the appropriateness of fixing the racism or the technical hurdles involved in that. It's outside my expertise and not particularly relevant to the comment I was responding to.
I suppose this could be an example of Popper’s third world.
Because racism is about harm, not an estimation of a thing's motivations and prejudices -- which is why it's still racist even if you didn't mean it or didn't know. It doesn't actually require a mind at all. Anything that confers, amplifies, or perpetuates harmful stereotypes or negative associations with people of a specific race is racist.

The thing you're calling racism is actually hate speech as it's typically defined in law.

Presumably, the GP considers intent to be the only relevant factor in determining if something is racist.
Leaving the current situation aside, it's an interesting philosophical point. In law you have the concept of "mens rea" https://en.wikipedia.org/wiki/Mens_rea
Intent WAS a factor here. There was no intent to consider anyone other than white people when the model was trained.
Define racism?
That's a good question. This ML bias isn't necessarily due to one prejudiced person doing this on purpose. It's connected to systemic racism, where history and culture added up to a status quo that is biased.

The fact is that training sets usually contain many more white men than black women, especially if they're just scraped off the web. People who guided the training may have just used datasets that reflect their own culture and the demographics of their own country, and didn't see a problem with that. The opposite would have been seen as "pandering to diversity" in their country, so they've ended up with a biased dataset and a biased algorithm.
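
The imbalance described above can be checked before training by counting how often each group appears in the dataset. This is a rough sketch under assumptions: the group annotations, the 25% threshold, and the helper names `group_shares` and `underrepresented` are all hypothetical, and real datasets often lack such annotations in the first place.

```python
from collections import Counter

def group_shares(group_labels):
    """Return each group's share of the dataset as a fraction of the total."""
    counts = Counter(group_labels)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}

def underrepresented(shares, min_share):
    """Return the groups whose share falls below min_share, sorted by name."""
    return sorted(g for g, s in shares.items() if s < min_share)

# Toy annotations (invented): 8 images from group "a", 2 from group "b".
labels = ["a"] * 8 + ["b"] * 2

shares = group_shares(labels)
flagged = underrepresented(shares, min_share=0.25)
print(shares, flagged)
```

A check like this only flags raw representation; it does not show whether the model actually performs equally well on each group, which needs per-group evaluation.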