
If the algorithm is overfitting, it's costing its users money in the general case. Again, that's self-righting. We don't need to hide the data we think it's overfitting on; no modern production ML system should have trouble with extraneous data. You don't see loan bots denying loans to people because they're named "Phil", for example, even though the bots have that information.

(Good) ML algorithms don't suffer from human biases; they don't know that there's a categorical difference between, say, race and shoe size, so we don't need to hide race from these algorithms. That is, of course, unless one's explicit goal is to cripple and pessimize the algorithm for political reasons.