HN Mirror

by baxter001 1641 days ago

That's completely beside the point of the article. You've taken a result (a 0.8 percent error rate for gender classification of light-skinned men versus a 34.7 percent error rate for the same classifier on dark-skinned women) and turned it into some kind of Google image search language game?

I can only quote Joy Buolamwini on this:

“To fail on one in three, in a commercial system, on something that’s been reduced to a binary classification task, you have to ask, would that have been permitted if those failure rates were in a different subgroup?”

b9a2cab5 1641 days ago

The answer would probably be yes if that subgroup wasn't a large percentage of the dataset used for training and testing. Or if that subgroup wasn't a large percentage of the user base.

Come on, if you've worked at any large company using ML you know model performance is literally just taking the average accuracy/ROC/precision/etc. over your training dataset plus some hold-out sets. Then you track proxy metrics like engagement to see if your model actually works in production. At no point does race come into the equation. Naturally, if your chosen subgroup isn't a large proportion of either the dataset or the user base, its poor performance never shows up in your metrics, so you don't care to fix it.
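
To make the averaging point concrete, here is a minimal sketch in Python. The 950/50 split and the group labels "A" and "B" are made up for illustration, with per-group error counts chosen to land near the 0.8 and 34.7 percent figures quoted upthread; none of it comes from the actual study.

```python
from collections import defaultdict

def error_rates(records):
    """Return (overall_error, per_group_error) for (group, y_true, y_pred) tuples."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, y_true, y_pred in records:
        totals[group] += 1
        if y_true != y_pred:
            errors[group] += 1
    overall = sum(errors.values()) / sum(totals.values())
    per_group = {g: errors[g] / totals[g] for g in totals}
    return overall, per_group

# Hypothetical evaluation set: 950 samples from group "A", 50 from group "B".
records = (
    [("A", 1, 1)] * 942 + [("A", 1, 0)] * 8    # 8 of 950 misclassified
    + [("B", 1, 1)] * 33 + [("B", 1, 0)] * 17  # 17 of 50 misclassified
)

overall, per_group = error_rates(records)
print(f"overall error: {overall:.3f}")
for group, rate in sorted(per_group.items()):
    print(f"group {group} error: {rate:.3f}")
```

With these made-up numbers the single headline figure is a 2.5 percent error rate, which looks fine on a dashboard, while group "B" is failing about one time in three. Nothing surfaces that unless someone breaks the metric out by group.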

indigo945 1641 days ago

Obviously, but the question is, why were there no Black women in the data set, and what care can be taken to prevent racialized bias when selecting the data set in the future?

Blikkentrekker 1641 days ago

I would assume these data sets are not manually selected but imported from some mechanism.

Other issues are sure to arise: the AI will have trouble with people who aren't smiling, the data set probably contains people who look better than average, and it almost certainly under-represents people with injuries or deformities relative to their share of the population.

Perhaps an interesting project would simply be the compilation of a vast dataset of “world-proportional pictures of people”. It would be an interesting undertaking to realize such a dataset.

tsimionescu 1641 days ago

World-proportional is not good enough for this type of task. If we are to rely on AI for things like identifying people in pictures in a trial, we would need equal representation in the data set, so the AI doesn't have any kind of systematic bias. Otherwise, the AI's bias will compound errors in the real world. So you would need as many pictures of Australian Aborigines in the data set as of Han Chinese people if you wanted to be sure there isn't a risk that a random person would be confused for someone from an over- or under-represented group.
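
A rough sketch of what "equal representation" could mean mechanically, in Python: draw the same number of examples from every group rather than sampling in proportion to the source pool. The function name `balanced_sample`, the `group_of` callback, and the 900/100 toy pool are all hypothetical, and this only covers the sampling step, not how group labels would be obtained in the first place.

```python
import random
from collections import defaultdict

def balanced_sample(items, group_of, per_group, seed=0):
    """Draw exactly `per_group` items from each group, without replacement."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    by_group = defaultdict(list)
    for item in items:
        by_group[group_of(item)].append(item)

    sample = []
    for group, members in by_group.items():
        if len(members) < per_group:
            # Refuse to paper over a group that is too small to fill its quota.
            raise ValueError(f"group {group!r} has only {len(members)} items")
        sample.extend(rng.sample(members, per_group))
    return sample

# Toy pool: 900 items labelled "A" and 100 labelled "B".
pool = [("A", i) for i in range(900)] + [("B", i) for i in range(100)]
picked = balanced_sample(pool, group_of=lambda item: item[0], per_group=50)

counts = defaultdict(int)
for group, _ in picked:
    counts[group] += 1
print(dict(counts))
```

The sketch raises an error when a group cannot fill its quota instead of silently oversampling, since the point is to notice when the pool is too thin for a group.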

b9a2cab5 1641 days ago

Certainly you can ask these questions but these are business process issues, not technical ones. They're unrelated to AI.

My personal take is you won't see any tangible movement on this until black women (or whatever group you choose) comprise a tangible proportion of revenue generating users. Corporations operate for money and nothing else.

notahacker 1641 days ago

Of course they are related to what we call AI, because what we call AI depends primarily on the quality of the business processes behind data selection and testing. If business processes tend strongly to create systematic errors in the results the technology generates, that is an underlying weakness of the technology. (An AI trained in China sucking at recognising white people wouldn't be a counterexample to this phenomenon; it would be the same issue.) The utility of the technology needs to be viewed in that context: it's likely compromised by biases in the business processes of its developers.

Black women or other groups not viewed as the mainstream target for an AI solution aren't going to form a tangible proportion of revenue-generating users if the software doesn't function properly for them. And a lot of the use cases for AI analysis don't involve the unrepresented-in-corpus minority group being the consumer anyway. They involve the tool being used to screen that group by a third party who's been sold it on the false premise that it's free from human bias.
