by yummyfajitas
3549 days ago
One approach is to directly model the corruption process. Being the model-based-Bayesian guy I am, this is something I like to do. But if your model is sufficiently expressive you don't need to explicitly build or model the corruption process. In the example in my linked blog post, test scores might be biased against blacks. But race is also redundantly encoded, so the algorithm has enough information to fix the bias completely by accident. Fundamentally what I'm saying here is that bias is a statistics problem and has a statistics solution. Insofar as your complaint is algorithms finding the wrong answer, the solution is better stats. And nothing whatsoever that I've said here would be remotely controversial if the topic were remote sensing.
This is the claim that I am having trouble with.
Say I have two random variables X, Y with some joint distribution. If a corruption process can tamper with the samples drawn from it, I cannot see how a model could possibly recover either the joint or the conditional distribution.
Are you saying that the corruption is benign, like missing at random or missing completely at random? Then it's much more believable.
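
To make the disagreement concrete, here is a rough simulation sketch, not anything taken from the linked blog post. Everything in it is an assumption made up for illustration: a binary `group` indicator, a latent `ability`, a fixed additive `bias` of 0.5, and a linear outcome. It contrasts corruption of a feature (where a regression that also sees the group can undo the bias) with corruption of the outcome itself (where the regression just learns the penalty).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical setup: all names and numbers here are illustrative assumptions.
group = rng.integers(0, 2, size=n)        # binary group indicator (0 or 1)
ability = rng.normal(0.0, 1.0, size=n)    # latent ability, same in both groups
bias = 0.5                                # fixed penalty applied to group 1
outcome = 2.0 * ability + rng.normal(0.0, 0.1, size=n)  # uncorrupted outcome


def ols(features, target):
    """Least-squares fit with an intercept; returns [intercept, coef_1, ...]."""
    X = np.column_stack([np.ones(len(target))] + features)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coef


# Case 1: the corruption hits a feature. The observed test score is biased,
# but the group is visible to the model and the outcome is clean.
score = ability - bias * group
coef_feature = ols([score, group], outcome)
# Since ability = score + bias * group, the fit should put roughly
# 2.0 on score and 2.0 * bias = 1.0 on group, cancelling the bias.

# Case 2: the corruption hits the outcome itself.
corrupted_outcome = outcome - 1.0 * group
coef_label = ols([ability, group], corrupted_outcome)
# The fit now learns roughly -1.0 on group: it reproduces the penalty,
# and nothing in the data distinguishes it from a real effect.

print("feature corruption:", np.round(coef_feature, 3))
print("label corruption:  ", np.round(coef_label, 3))
```

In this toy setup the first case matches the "fixed by accident" story, but only because the outcome used for fitting is clean and the bias is a simple function of an observed variable. The second case is closer to the worry above: when the corruption touches the samples of Y themselves, the fitted conditional is the corrupted one.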