by practice9 2832 days ago
> fighting algorithmic racism

Reminds me of how Google Photos couldn't differentiate between a black person & a monkey, so they've excluded that term from search altogether.

While the endeavour itself is good, fixes are sometimes hilariously bad or biased (untrue)

4 comments

> Reminds me of how Google Photos couldn't differentiate between a black person & a monkey, so they've excluded that term from search altogether.

Technically that is what happened, but it paints an incorrect picture in people's minds. Out of the billions of images that Google Photos had auto-tagged, it tagged one picture of two black people as "gorillas".[1] This was probably the first time this had ever happened. (If it had happened before, it surely would have been spread far and wide by social media & the press.)

So Google's classifier was inaccurate 0.0000001% of the time, but the PR was so bad that Google "fixed" the issue by blacklisting certain tags (monkey, gorilla, etc). If you take photos of monkeys, you'll have to tag them yourself.
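
To make the "fix" concrete, here is a minimal sketch of what tag blacklisting could look like. It is an illustration only: the list contents, function names, and data layout are all assumptions, not Google's actual implementation, which is not public.

```python
# Hypothetical blocklist; the real list Google used is not public.
BLOCKED_TAGS = {"gorilla", "monkey"}


def visible_tags(predicted_tags, blocked=BLOCKED_TAGS):
    """Drop any classifier label that appears on the blocklist."""
    return [tag for tag in predicted_tags if tag.lower() not in blocked]


def search(photos, query, blocked=BLOCKED_TAGS):
    """Return ids of photos whose visible tags match the query.

    `photos` maps a photo id to the list of labels the classifier produced.
    A blocked query returns nothing, no matter what the classifier predicted.
    """
    q = query.lower()
    if q in blocked:
        return []
    return [
        photo_id
        for photo_id, tags in photos.items()
        if q in (tag.lower() for tag in visible_tags(tags, blocked))
    ]
```

Note that nothing about the classifier changes here. The labels are still predicted and are simply never shown or searchable, which is why photos of actual monkeys have to be tagged by hand.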

I'm sure Google could do better, but the standard required to avoid a PR disaster is impossible to meet. If the classifier isn't perfect forever, they're guaranteed to draw outrage.

1. https://twitter.com/jackyalcine/status/615329515909156865

Our expectations of our algorithms are based on human performance. A human would never tag a black person as a gorilla, or vice versa, and if someone did it even once we could pretty safely conclude they're either extraordinarily incompetent or racist, and in either case we wouldn't trust any tagging done by such a human.

> This was probably the first time this had ever happened. (If it had happened before, it surely would have been spread far and wide by social media & the press.)

That is a very big leap. Social media might be widespread, but almost everything in the world goes unremarked upon. Think of all the news stories that turn up an old tweet or Facebook post that, if anyone had paid attention at the time, would have stopped events from progressing.

There's a difference between a short-term hack and a real fix. The real solution was for them to train their model on more pictures of black people.

The research fix may have been to train their model on better data, but the blacklist of bad terms was as real a product fix as it gets.

One of the things that amuses me is trying to find racist/sexist google search results. Here's a few:

I remember a while back Google got flak because the image search for "scientist" was almost entirely famous African American scientists. That's now changed and shows stock images of (mostly white) people in lab coats.

"Three black teenagers" shows mostly groups of mugshots.

The word "Brazilian" shows hot, almost nude women. "German" shows the flag. "Portuguese" shows maps, flags, and a lot of normal-looking people. For "Hispanic", all the pictures are of normal-looking people.

Seeing images that would be 'racist' or 'sexist' is reflective of you, not the results. For instance, if you search for 'white man and white woman' you'll find almost exclusively pictures of interracial couples. Is it some conspiracy to push interracial relations onto people? People of a different bias would say so, and it's equally ridiculous. In reality, the simple matter is that Google's search is still extremely primitive and the results are mediocre at best. So you can easily break the search by searching for anything that cannot be trivially matched to text the way names such as Justin Bieber or Abraham Lincoln can.

For instance search for 'green circle' - okay you get mostly green circles. Now search for 'green circle with red line' and the results are completely nonsensical. The huge leap forward in search engines was being able to avoid returning hardcore porn when searching for Abraham Lincoln. But in spite of tens of thousands of engineers, hundreds of billions of dollars in revenue, and all sorts of fancy declarations of ultra sophisticated AI solving every problem under the sun, we really haven't moved that far beyond that early milestone.

Yeah, I didn't mean to suggest I think the AI/search results are actually racist/sexist. If I really believed that, I wouldn't find it amusing. As you suggest, it's an amusing anecdote which shows how much farther we have to go with regards to getting ML/AI/search right.

'Brazilian' has other meanings you may be unaware of, namely it being the name for a bikini wax. Almost nude women are only to be expected in this case. Just another example of how complex these things are linguistically and culturally.

I would recommend checking out the google image search for "Brazilian wax" to see what comes up for that. It's not a bunch of hot models in bikinis.
I would recommend looking into why people do Brazilian waxing, and particularly how it relates to bikinis.
Well, to be fair they excluded high tens / low hundreds of potentially offensive terms from search before even launching and when this came out they just extended the list a little. Sometimes having product vision requires recognizing that the products you build come with limitations and potential for very real emotional reactions of very real human being users.
I believe it was "gorilla", not "monkey", and I understand Google not wanting its product to randomly call people animal names, especially people who belong to a group for whom that insult is far too common.