by blt 2808 days ago
It's wrong even if their model doesn't output a certainty (not all classifiers do). Almost all ML algorithms minimize the expected classification error under the training distribution. So if the training data is 90% men, a model that classifies men with 100% accuracy and women with 0% accuracy (90% overall) scores better than one that classifies both groups with 89.9% accuracy. Any unsophisticated model will make this tradeoff if it has to.
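To make the arithmetic concrete, here is a rough sketch of the overall accuracy in both cases. The helper name `overall_accuracy` and its parameters are made up for illustration; this only does the weighted average and is not taken from any particular library or model.

```python
def overall_accuracy(frac_men, acc_men, acc_women):
    """Expected accuracy over a population with the given share of men."""
    return frac_men * acc_men + (1.0 - frac_men) * acc_women

# Assumed split from the comment above: 90% men, 10% women.
skewed = overall_accuracy(0.9, acc_men=1.0, acc_women=0.0)
balanced = overall_accuracy(0.9, acc_men=0.899, acc_women=0.899)

# An error-minimizing learner prefers whichever number is higher.
print(f"skewed: {skewed:.3f}  balanced: {balanced:.3f}")
```

The skewed model comes out ahead by a tenth of a percentage point, which is all a plain error-minimizing objective looks at.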

gp: "The number of women and men in the data set shouldn't matter (algorithms learn that even if there was 1 woman, if she was hired then it will be positive about future woman candidates)."

This is false for typical models.