|
|
|
|
|
by baxter001
1641 days ago
|
|
Completely not the focus of the article, and you've turned the result of an error rate of 0.8 percent for gender classification of light-skinned men and a 34.7 percent error rate for the same classifier on dark-skinned women - into some kind of google image search language game? I can only quote Joy Buolamwini on this: “To fail on one in three, in a commercial system, on something that’s been reduced to a binary classification task, you have to ask, would that have been permitted if those failure rates were in a different subgroup?” |
|
Come on, if you've worked at any large company using ML you know model performance is literally just taking the average accuracy/ROC/precision/etc over your training dataset plus some hold out sets. Then you track proxy metrics like engagement to see if your model actually works in production. At no point does race come into the equation. Naturally, if your choice of subgroup happens to not be a large proportion of either the dataset or the userbase then you don't see the poor performance on that subgroup show up in your metrics so you don't care to fix it.