| HN Mirror


by jskajakzkjx 2029 days ago

> If you had a model trained on a large corpus of data from the pre-Civil War southern American states, it would have been deeply racist, and would even view black people as possible property. If you had one that was trained on data from the 1950s, it would be less racist but still problematic when viewed by people today. Is there really something special about today that removes these kinds of concerns for a model trained on current data?

I think this argument applies not just to machine learning, but to learning in general. Any knowledge-acquisition process is going to be biased by the environment in which it occurs. That goes not just for digital neural networks but also for the ones in our human brains, which operate on the same racist data the ML models do. If that means we shouldn’t do machine learning, it means we shouldn’t do human learning either.

Of course, that conclusion is absurd. A more reasonable take is that we should adjust the objective functions of our learning processes to try to account for the effects of biases. We try to do that subjectively, as any decent person operating in a biased society should, but our ML models can do it more accurately. In fact, I’d argue that such techniques are necessary to analyze those biases more carefully and to build evidence describing their effects. They can provide insights that will even improve our ability to correct for biases in the real world.
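
As a rough illustration of what "adjusting the objective function" could look like, here is a minimal sketch of one possible approach, often called reweighing: each training example gets a weight so that, under the weights, the label is statistically independent of a group attribute, and the loss is then a weighted average. The function names, the group labels "a" and "b", and the toy data are all hypothetical, and this is only one of many debiasing techniques, not a claim about what any particular system does.

```python
import math
from collections import Counter


def reweighing_weights(groups, labels):
    """Weight each example by expected / observed frequency of its (group, label) cell.

    "Expected" assumes group and label are independent, so the weighted data
    shows no association between the two.
    """
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


def weighted_log_loss(probs, labels, weights):
    """Binary cross-entropy where each example contributes in proportion to its weight."""
    eps = 1e-12  # guard against log(0)
    total = 0.0
    for p, y, w in zip(probs, labels, weights):
        p = min(max(p, eps), 1.0 - eps)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / sum(weights)


def weighted_positive_rate(groups, labels, weights, group):
    """Share of weighted mass in `group` that carries a positive label."""
    mass = sum(w for g, w in zip(groups, weights) if g == group)
    positive = sum(w for g, y, w in zip(groups, labels, weights) if g == group and y == 1)
    return positive / mass


# Toy data (made up): group "a" is mostly labeled 1, group "b" mostly 0.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
weights = reweighing_weights(groups, labels)
```

On the toy data, the raw positive rate is 3/4 for group "a" and 1/4 for group "b"; after reweighing, `weighted_positive_rate` comes out equal for both groups, so a model minimizing `weighted_log_loss` no longer gets rewarded for simply reproducing that gap. Whether that is the right correction depends on why the gap exists in the first place, which is a judgment the code can't make for you.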