by srean
3549 days ago
Are you saying that it can form a good estimate of the conditional probability? I can believe that if the sampling process preserves the conditional. Otherwise one would have to make assumptions about (in other words, model) the corruption process. The bias-compensation machinery then has to be deliberate; it won't happen on its own. Some sampling processes do not modify the conditional, and in those cases no special machinery would be required.
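A minimal sketch of that distinction, with made-up numbers. The joint distribution `joint` and the two selection rules are assumptions for illustration only: selection that depends only on x leaves P(y | x) alone, while selection that depends on y distorts it.

```python
from fractions import Fraction as F

# Hypothetical joint distribution P(x, y) over binary x and y.
joint = {(0, 0): F(3, 10), (0, 1): F(1, 10), (1, 0): F(2, 10), (1, 1): F(4, 10)}

def conditional_y1_given_x(joint, keep):
    """P(y=1 | x) after keeping each (x, y) with probability keep(x, y)."""
    out = {}
    for x in (0, 1):
        w0 = joint[(x, 0)] * keep(x, 0)
        w1 = joint[(x, 1)] * keep(x, 1)
        out[x] = w1 / (w0 + w1)
    return out

# No selection at all: the true conditional.
true_cond = conditional_y1_given_x(joint, lambda x, y: F(1))

# Selection depends only on x: x=1 rows are kept one time in five.
by_x = conditional_y1_given_x(joint, lambda x, y: F(1, 5) if x == 1 else F(1))

# Selection depends on y: y=1 rows are kept one time in five.
by_y = conditional_y1_given_x(joint, lambda x, y: F(1, 5) if y == 1 else F(1))

print("true      :", true_cond)
print("select x  :", by_x)
print("select y  :", by_y)
```

In the x-only case the keep probability cancels out of the ratio, so the conditional is untouched. In the y case it does not cancel, and recovering the true conditional needs a model of the selection.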
But if your model is sufficiently expressive you don't need to explicitly build or model the corruption process. In the example in my linked blog post, test scores might be biased against blacks. But race is also redundantly encoded, so the algorithm has enough information to fix the bias completely by accident.
Fundamentally what I'm saying here is that bias is a statistics problem and has a statistics solution. Insofar as your complaint is algorithms finding the wrong answer, the solution is better stats.
And nothing whatsoever that I've said here would be remotely controversial if the topic were remote sensing.
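A toy sketch of the "redundantly encoded" point, under loud assumptions: the data is invented and noise-free, the score understates ability by a fixed `BIAS` for one group, and `proxy` is a hypothetical feature that encodes the group perfectly. `solve` and `least_squares` are small helpers written for this example, not a library API.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination (A square, invertible)."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        scale = M[col][col]
        M[col] = [v / scale for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    return [row[n] for row in M]

def least_squares(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(XtX, Xty)

# Assumed setup: the observed score understates ability by BIAS for group 1.
BIAS = 10
people = [(F(a), g) for a in range(50, 100, 10) for g in (0, 1)]
ability = [a for a, g in people]
score = [a - BIAS * g for a, g in people]
proxy = [g for a, g in people]  # hypothetical feature that encodes the group

# Model 1: intercept + score. Model 2: intercept + score + proxy.
w_score_only = least_squares([[F(1), s] for s in score], ability)
w_with_proxy = least_squares(
    [[F(1), s, F(p)] for s, p in zip(score, proxy)], ability
)

print("score only :", w_score_only)
print("with proxy :", w_with_proxy)
```

With the proxy available, the fit puts a coefficient of exactly `BIAS` on it and so undoes the score's offset without anyone having modeled the corruption. The score-only fit cannot do that and systematically underpredicts group 1. Real data has noise and imperfect proxies, so the correction would be partial rather than exact.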