by zkhalique
3483 days ago
Yeah, the word bias seems to be overloaded with many meanings. I actually wonder if there is any philosophical difference between the word "knowledge" and the word "bias". But about machine learning... a system can learn from a set of outcomes which are the result of a "biased" system, that is to say tainted with an incorrect Bayesian prior which is not properly corrected, such as let's say courts in a racist society or whatever. It would learn the same bias. Because its goal is to maximize compatibility with the outcomes that the humans did. So it perpetuates those weights. The problem is that we don't know whether the human decisions matched the reality. "Did the person commit the crime" for example. We might have to wait until more unbiased estimators for such activity come along, and throw away old historical data. It's sort of like when Black-Scholes became a self-fulfilling prophecy for valuing derivatives, but only after it became widespread. The market started using Black-Scholes to value derivatives, so it became the best model to predict the value of derivatives. But until then, other models might have fit the historical data better.
If you interpret the word "bias" statistically - rather than the way innumerate reporters trying to sound profound use it - then it's pretty straightforward to do so.
"Knowledge" represents a fixed snapshot of your information about the world.
"Bias" represents a tendency for your knowledge to systematically differ from reality even as more data is gathered.
In programming terms, knowledge is the data in your DB at a fixed time. Bias is a tendency for my inserts to fail while yours succeed.
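To make the DB analogy concrete, here is a rough sketch of it, not anything rigorous. The numbers (a true mean of 10, dropping half of the high values) and the name `estimate_mean` are made up for illustration. The point is that the error of the lossy collector does not shrink as you gather more data, while the error of the honest one does.

```python
import random
import statistics

TRUE_MEAN = 10.0  # hypothetical "reality" we are trying to learn


def estimate_mean(n, drop_high_prob=0.0, seed=0):
    """Average up to n draws from N(TRUE_MEAN, 2).

    Draws above TRUE_MEAN are silently dropped with probability
    drop_high_prob. That is the "my inserts fail, yours succeed" part.
    """
    rng = random.Random(seed)
    kept = []
    for _ in range(n):
        x = rng.gauss(TRUE_MEAN, 2.0)
        if x > TRUE_MEAN and rng.random() < drop_high_prob:
            continue  # failed insert: this observation never reaches the DB
        kept.append(x)
    return statistics.fmean(kept)


for n in (100, 10_000, 200_000):
    unbiased_error = estimate_mean(n) - TRUE_MEAN
    biased_error = estimate_mean(n, drop_high_prob=0.5) - TRUE_MEAN
    print(n, round(unbiased_error, 3), round(biased_error, 3))
```

Each row is one snapshot of "knowledge". The second error column settling at a nonzero value no matter how large `n` gets is the "bias".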
But about machine learning... a system can learn from a set of outcomes which are the result of a "biased" system, that is to say tainted with an incorrect Bayesian prior which is not properly corrected, such as let's say courts in a racist society or whatever. It would learn the same bias. Because its goal is to maximize compatibility with the outcomes that the humans did. So it perpetuates those weights.
If the goal of a system is to predict human behaviors, then it will in fact do that. That's not "bias", that's just building a system with the goal of matching human behavior and getting what you asked for.
However, if the inputs to your algorithm are biased predictors of the output, your outputs can still be unbiased. Machine learning systems tend to detect hidden patterns in the data; biased inputs are just another pattern.
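A toy illustration of that claim, under assumptions I am inventing for the example: two groups "A" and "B", a score that understates the outcome by 5 points for group B, and a model that is allowed to see group membership. This is a sketch using plain least squares, not a description of any real system.

```python
import random
import statistics

rng = random.Random(0)


def make_row():
    group = rng.choice("AB")
    outcome = rng.gauss(50.0, 10.0)        # the thing we want to predict
    score = outcome + rng.gauss(0.0, 1.0)  # noisy measurement of the outcome
    if group == "B":
        score -= 5.0  # hypothetical bias: the score understates group B
    return group, score, outcome


rows = [make_row() for _ in range(20_000)]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# Model 1: one line for everyone, blind to the group.
pooled = fit_line([s for _, s, _ in rows], [o for _, _, o in rows])

# Model 2: one line per group, so the offset is just another pattern to learn.
per_group = {
    g: fit_line([s for gg, s, _ in rows if gg == g],
                [o for gg, _, o in rows if gg == g])
    for g in "AB"
}


def naive(group, score):
    slope, intercept = pooled
    return slope * score + intercept


def group_aware(group, score):
    slope, intercept = per_group[group]
    return slope * score + intercept


def mean_error(predict, group):
    """Average of (prediction - outcome) within one group."""
    return statistics.fmean(predict(g, s) - o for g, s, o in rows if g == group)


for g in "AB":
    print(g, round(mean_error(naive, g), 2), round(mean_error(group_aware, g), 2))
```

The group-blind model inherits the bias of its input: it underpredicts B and overpredicts A. The group-aware model's average error is essentially zero in both groups, because the fit absorbs the offset. Note this only works because the training outcomes here are the true outcomes; it says nothing about the case where the labels themselves are wrong.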
I wrote a blog post earlier this year explaining this in detail: https://www.chrisstucchio.com/blog/2016/alien_intelligences_...