by tcmart14 1746 days ago
I don't know if I 100% agree with how you stated it, but yeah, it's a matter of the training data set. I don't think these companies have published their training data sets. But think back to the issue with Asian faces and Apple's Face ID. If they had just chosen 100 people at random in the US, 5-6 of those 100 would have been Asian, which reflects the 5.7 percent of the US population that is Asian. We'd probably all agree that 5-6 people is not a sufficient data set, but picking 100 people at random would be an easy shortcut to take when building one.

So yeah, I think it is an issue of building a data set and not covering enough cases. In this instance, Asian faces would be an edge case: a small training set drawn at random ends up with very few examples from any group that makes up a small share of the population.
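
To put rough numbers on that, here is a quick Python sketch. The helper names and the target of 100 faces per group are made up for illustration; the 5.7 percent share is the figure quoted above, and nothing here reflects how Apple actually built its set.

```python
import math

# Share of the US population that is Asian, as quoted in the comment above.
ASIAN_SHARE = 0.057


def expected_group_count(sample_size: int, group_share: float) -> float:
    """Expected number of people from a group in a simple random sample."""
    return sample_size * group_share


def sample_size_for_target(target_count: int, group_share: float) -> int:
    """Random-sample size needed so the expected group count reaches target_count."""
    return math.ceil(target_count / group_share)


if __name__ == "__main__":
    # 100 people at random: only about 5-6 expected from the group.
    print(round(expected_group_count(100, ASIAN_SHARE), 1))
    # Hypothetical target of 100 faces from the group (an arbitrary number).
    print(sample_size_for_target(100, ASIAN_SHARE))
```

The other option is to skip pure random sampling and deliberately recruit a fixed number of people per group, which is cheaper than growing the whole sample until the small groups are covered.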

1 comments

I wonder what the data sets of companies like Xiaomi look like. Face ID always worked for me, so it seems like it works for non-Asian faces.
Maybe they took more care with their data set. I think the only way we would know is if they published their sets or how they built them. I was just highlighting one possible way Apple could have generated its training set: just grab 100 people in America at random.