Hacker News
by mjburgess 1682 days ago
The issue is that the machine has no causal model of how the predictions are leading to the data. You could, as you say, try to single out some variables and rig how they're processed.

Here I don't think that helps: so long as you predict area A has more crime, area A is more policed and thus always appears to have more crime.
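
To make the loop concrete, here is a toy sketch with made-up numbers. It assumes two areas with identical true crime, a "predictor" that simply picks the area with the most recorded crime, and crime that only gets recorded where patrols are sent. The function name, parameters, and detection rule are all hypothetical, not a model of any real system.

```python
def simulate(rounds=10, true_rate=(100, 100), initial_records=(11, 10), patrols=10):
    """Toy feedback loop: patrols follow the records, records follow the patrols."""
    records = list(initial_records)
    history = []
    for _ in range(rounds):
        # "Prediction": whichever area has the most recorded crime so far.
        target = 0 if records[0] >= records[1] else 1
        # Assumption: crime is only recorded where patrols are present,
        # at a fixed detection rate per patrol unit.
        detected = true_rate[target] * patrols // 100
        records[target] += detected
        history.append(tuple(records))
    return history


history = simulate()
print(history[-1])  # (111, 10)
```

Both areas have the same true rate, yet area A (index 0) ends with 111 recorded crimes against area B's 10, purely because a one-record head start decided where the patrols went. Nothing in the recorded numbers distinguishes this from A genuinely having more crime.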

The issue, in my view, is not the data nor the algorithm -- no modification to either can fix the issue. The issue is the machine isn't embedded in the world, and esp. has no ability to acquire a rich understanding of the human social environment.

A fundamental aspect of understanding X is knowing what is irrelevant to X, i.e., what "data" it is permissible/essential to ignore. In the case of crime, one ought to ignore data from over-policing -- but this is not an effect which is present in the data itself.