by foooorsyth 1028 days ago
> Feels like a case of outrage-for-clicks

Like 99% of these “AI discrimination” articles.

>human-detecting AI is developed in a western country with ~60% white population. Most of the training data is collected there

>the AI performed slightly worse in Uttar Pradesh, where the people and everything else in the background look different

>AI is prejudiced! Get outraged!

Every time.


Weird, you articulated the exact point of bias in AI, but your tone is dismissive. Yes, obviously AI is not a moral agent and it isn't racist per se. But if its input is biased and the test is biased, then the application will be biased. That's a problem if you go and deploy these models where their training data is lacking. Say a self-driving car using the AI you described is deployed in Uttar Pradesh: it is less safe there because of that bias.

What is wrong with this statement in your opinion?

I'm dismissive of endless streams of unhelpful clickbait articles written by barely-tech-literate journalists aiming to spark racial outrage. I'm not dismissive of the threat of bias in AI, and I'm certainly not a fan of cavalier automotive companies running clearly-alpha autonomous software on public roads.

Still, I find most of these kinds of articles to be obnoxious and unhelpful in their shaming. Did you read the paper linked in the OP? It investigates the following datasets: CityPersons, EuroCityPersons, and BDD100k.

* CityPersons data is mostly from Germany (with all of it coming from central Europe)

* EuroCityPersons, as the name implies, is data from European cities

* BDD100k data is from NYC and the Bay Area

So are we shocked by the outcomes here, or are they more or less obvious? If I trained a popular object detector on pedestrian image data collected solely in India, would you be surprised if it performed poorly outside of India? Would that warrant a racially inflammatory article title?

And, as others have pointed out, these types of investigations make the implicit assumption that autonomous-vehicle companies aren't working hard to reduce these biases in their own internal training pipelines. As reckless as some of those companies are, I promise you that they are not relying solely on Western data before deploying to non-Western streets. The investigators here may have simply been lazy/biased themselves (only investigating openly available datasets from Western regions), and then projected that laziness/bias onto AV companies.