by fumblebee 1499 days ago
My first thought here is to relate this to the problem of early colour film, which was largely tested and validated with only light skin tones in mind. Once it was out in the wild, folks with darker skin tones found the product to be total crap. Why? Because there was a glaring OOD (out-of-distribution) problem during testing: darker skin tones were never part of what it was validated against.

Similarly, if the train/test sets used here for ML-based X-ray diagnostics cover only specific races, then performance might be worse for other races, given that the model is now facing a discriminating variable it never saw during training.

The obvious solution here is to reduce bias by ensuring race is recorded and well represented in the data used for both training and testing. Given the privacy laws around PII, that may actually be quite challenging! Fascinating tradeoff imo.
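
To make the testing half of that concrete, here's a rough sketch of checking performance per group rather than only in aggregate. Everything in it is an assumption for illustration: the `accuracy_by_group` helper is hypothetical, the group labels "A" and "B" are placeholders, and the labels and predictions are made-up toy values, not real diagnostic data.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} for each distinct group label (hypothetical helper)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}


# Made-up toy values, purely illustrative (not real data).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

per_group = accuracy_by_group(y_true, y_pred, groups)
# Gap between the best- and worst-served group.
gap = max(per_group.values()) - min(per_group.values())
print(per_group, gap)
```

Overall accuracy on this toy set is 75%, which hides the fact that one group gets every prediction right and the other only half. Of course, this check is only possible if the group label was collected in the first place, which is exactly where the PII tradeoff bites.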