by belorn 2610 days ago

> Often these errors stem from systematic biases in our society

No, this does not match either.

One of the easiest ways to get an ML model that makes systematic errors is a spam filter. If I train on my spam folder without any consideration, the filter will learn that any language which isn't my own is spam, and that servers located outside my nation are spammers. This resembles prejudice.

The cause of this systematic error is that an individual email address does not receive ham uniformly from every nation and in every language. Proximity warps the data. I would need to normalize the data by language and nation if I wanted to remove those errors from the filter. Looking at it from a political perspective does not make the filter perform better, and fixing it from that side has a high risk of causing even more errors in the model.
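To make "normalize the data" concrete, here is a minimal sketch of one way it could be done: reweighting the training samples so that, in the weighted data, language by itself says nothing about whether a message is spam. The mailbox counts, the `"own"`/`"other"` language tags, and the function names are all made up for illustration, and this is only one possible reading of normalization, not a description of any particular filter.

```python
from collections import Counter

def reweigh(samples):
    """Return one weight per (language, label) sample.

    Each weight is P(language) * P(label) / P(language, label), so that in
    the weighted data the label is statistically independent of the language.
    """
    n = len(samples)
    lang_counts = Counter(lang for lang, _ in samples)
    label_counts = Counter(label for _, label in samples)
    joint_counts = Counter(samples)
    return [
        (lang_counts[lang] * label_counts[label]) / (n * joint_counts[(lang, label)])
        for lang, label in samples
    ]

def weighted_spam_rate(samples, weights, lang):
    """Weighted fraction of messages in `lang` that are labeled spam."""
    total = sum(w for (l, _), w in zip(samples, weights) if l == lang)
    spam = sum(
        w for (l, label), w in zip(samples, weights)
        if l == lang and label == "spam"
    )
    return spam / total

# Hypothetical mailbox: mostly ham in my own language, mostly spam in others.
samples = (
    [("own", "ham")] * 8 + [("own", "spam")] * 2
    + [("other", "ham")] * 1 + [("other", "spam")] * 9
)

uniform = [1.0] * len(samples)
weights = reweigh(samples)

raw_own = weighted_spam_rate(samples, uniform, "own")
raw_other = weighted_spam_rate(samples, uniform, "other")
fixed_own = weighted_spam_rate(samples, weights, "own")
fixed_other = weighted_spam_rate(samples, weights, "other")
```

With uniform weights the spam rate differs sharply between the two languages, so "not my language" looks like a strong spam signal. After reweighting, both languages have the same weighted spam rate, and a filter trained with these sample weights has to rely on other features. Note that this only works if every language has at least one ham and one spam example; a combination that never appears in the mailbox cannot be weighted into existence.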