| HN Mirror


by zkhalique 3483 days ago

Yeah, the word bias seems to be overloaded with many meanings. I actually wonder if there is any philosophical difference between the word "knowledge" and the word "bias".

But about machine learning... a system can learn from a set of outcomes produced by a "biased" system, that is to say one tainted with an incorrect Bayesian prior that is never properly corrected, such as, let's say, courts in a racist society. It would learn the same bias, because its goal is to maximize agreement with the outcomes the humans produced. So it perpetuates those weights.
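A minimal sketch of that effect, under made-up assumptions: both groups have the same true rate (30%), and a hypothetical biased decision-maker additionally convicts 20% of innocent group-B defendants. The "model" here is just the historical conviction rate per group, which is a stand-in for whatever a real learner would fit.

```python
import random

rng = random.Random(42)
records = []
for _ in range(20_000):
    group = rng.choice("AB")
    guilty = rng.random() < 0.30  # assumed: same true rate in both groups
    # Hypothetical biased decision-maker: also convicts 20% of innocent
    # group-B defendants.
    convicted = guilty or (group == "B" and rng.random() < 0.20)
    records.append((group, guilty, convicted))

def rate(group, column):
    """Fraction of a group's records where the given column is True."""
    rows = [r for r in records if r[0] == group]
    return sum(r[column] for r in rows) / len(rows)

# The "model": predicted probability = historical conviction rate per group.
learned = {g: rate(g, 2) for g in "AB"}
# What it would have learned had the labels matched reality.
truth = {g: rate(g, 1) for g in "AB"}

print("learned from decisions:", learned)
print("underlying reality:    ", truth)
```

The learned rates differ between the groups even though the underlying rates do not, because the model is scored against the decisions, not against reality.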

The problem is that we don't know whether the human decisions matched the reality. "Did the person commit the crime" for example. We might have to wait until more unbiased estimators for such activity come along, and throw away old historical data.

It's sort of like when Black-Scholes became a self-fulfilling prophecy for valuing derivatives, but only after it became widespread. The market started using Black-Scholes to value derivatives, so it became the best model to predict the value of derivatives. But until then, other models might have fit the historical data better.


yummyfajitas 3483 days ago

> Yeah, the word bias seems to be overloaded with many meanings. I actually wonder if there is any philosophical difference between the word "knowledge" and the word "bias".

If you interpret the word "bias" statistically - rather than the way innumerate reporters trying to sound profound use it - then it's pretty straightforward to do so.

"Knowledge" represents a fixed snapshot of your information about the world.

"Bias" represents a tendency for your knowledge to systematically differ from reality even as more data is gathered.

In programming terms, knowledge is the data in your DB at a fixed time. Bias is a tendency for my inserts to fail while yours succeed.
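To make the DB analogy concrete, here is a rough simulation. The setup is invented for illustration: events come from groups A and B equally often in reality, but A's inserts succeed only half the time. The function name and parameters are hypothetical.

```python
import random

def observed_share_of_a(n_events, p_insert_a=0.5, p_insert_b=1.0, seed=0):
    """Share of group-A rows in a DB where A's inserts may fail more often."""
    rng = random.Random(seed)
    rows_a = rows_b = 0
    for _ in range(n_events):
        # In reality, each event is equally likely to come from A or B.
        if rng.random() < 0.5:
            if rng.random() < p_insert_a:
                rows_a += 1
        else:
            if rng.random() < p_insert_b:
                rows_b += 1
    return rows_a / (rows_a + rows_b)

small = observed_share_of_a(1_000)
large = observed_share_of_a(200_000)
fair = observed_share_of_a(200_000, p_insert_a=1.0)  # no failed inserts

print("biased inserts, 1k events:  ", small)
print("biased inserts, 200k events:", large)
print("fair inserts, 200k events:  ", fair)
```

With failing inserts the share of A rows settles near 1/3 rather than the true 1/2, and gathering more data does not move it closer. That persistence is the statistical sense of "bias"; the contents of the table at any one moment are the "knowledge".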

If the goal of a system is to predict human behaviors, then it will in fact do that. That's not "bias", that's just building a system with the goal of matching human behavior and getting what you asked for.

However, even if the inputs to your algorithm are biased predictors of the output, your outputs can still be unbiased. The tendency of machine learning systems is to detect hidden patterns in the data; biased inputs are just another pattern.
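A toy version of this, not taken from the blog post: assume a hypothetical input score that understates the true outcome by 10 points for group B, and a model of the form outcome ≈ score + offset[group] fitted by least squares (which, with the slope fixed at 1, is just the mean gap per group). All numbers are invented, and the errors are measured on the training data itself.

```python
import random
from statistics import mean

rng = random.Random(1)
rows = []
for _ in range(5000):
    group = rng.choice("AB")
    outcome = rng.gauss(50, 10)  # the quantity we want to predict
    # Hypothetical biased input: understates group B by 10 points.
    score = outcome + rng.gauss(0, 2) - (10 if group == "B" else 0)
    rows.append((group, score, outcome))

# "Training": least-squares offset per group for outcome ~ score + offset[group].
offset = {g: mean(o - s for grp, s, o in rows if grp == g) for g in "AB"}

def error_by_group(predict):
    """Mean signed prediction error (prediction minus outcome) per group."""
    return {
        g: mean(predict(grp, s) - o for grp, s, o in rows if grp == g)
        for g in "AB"
    }

raw_error = error_by_group(lambda g, s: s)                # trust the input as-is
fitted_error = error_by_group(lambda g, s: s + offset[g])  # learned correction

print("raw input error by group:   ", raw_error)
print("fitted model error by group:", fitted_error)
```

The raw score is off by about 10 points for group B, while the fitted predictions have essentially zero mean error in both groups: the systematic understatement is a pattern the fit picks up and cancels. This only works because the outcome used for training is itself trustworthy.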

I wrote a blog post earlier this year explaining this in detail: https://www.chrisstucchio.com/blog/2016/alien_intelligences_...


jerf 3483 days ago

So you know, I've gone ahead and submitted that: https://news.ycombinator.com/item?id=13158883

I do think it's important for people to be more clear about the goals they have for these systems. Do you want them to be accurate regardless of anything else, or do you want them to accomplish particular social goals by deliberately bending the results? There's nothing necessarily wrong with the latter, though you will inevitably find once you're being clear about the latter that even people who are generally ideologically aligned with each other will discover they have different and mutually conflicting goals...

Anyhow, either way, until you have a clear specification about goals you can't determine whether you're accomplishing them. A vague goal of being "nondiscriminatory" doesn't cut it for computers... "nondiscriminatory" is a set of possibilities, not a unique specification. (And if you disagree with my claim that it's a set, just try to sit down with one of the aforementioned ideologically-aligned people and hammer it out to a coding level of detail.)


yummyfajitas 3483 days ago

Not only that, there are impossibility theorems that say you can't meet everyone's definition of "nondiscriminatory" except in a few trivial cases.

https://arxiv.org/pdf/1609.05807v1.pdf

I.e. every algorithm will be "discriminatory" unless your predictor is perfect (i.e. gets 100% of decisions right). It's just a question of which kind of discriminatory they are. The same is true of any human decision process.
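The flavor of this can be seen with plain arithmetic. The sketch below uses a confusion-matrix identity for a yes/no classifier, which follows from the definitions of precision (PPV), false negative rate and false positive rate; it is a simplified illustration, not necessarily the exact formulation in the linked paper. The base rates and accuracy numbers are made up.

```python
from fractions import Fraction

def false_positive_rate(base_rate, ppv, fnr):
    """FPR implied by a group's base rate, precision (PPV) and FNR.

    From PPV = TP / (TP + FP), with TP = p * (1 - FNR) and
    FP = (1 - p) * FPR, where p is the base rate.
    """
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * (1 - fnr)

ppv = Fraction(3, 4)  # same precision for both groups
fnr = Fraction(1, 4)  # same false negative rate for both groups

# Hypothetical groups that differ only in base rate.
fpr_group_1 = false_positive_rate(Fraction(1, 2), ppv, fnr)
fpr_group_2 = false_positive_rate(Fraction(1, 4), ppv, fnr)
print(fpr_group_1, fpr_group_2)  # 1/4 1/12

# The two escape hatches: a perfect predictor, or equal base rates.
fpr_perfect_1 = false_positive_rate(Fraction(1, 2), Fraction(1), Fraction(0))
fpr_perfect_2 = false_positive_rate(Fraction(1, 4), Fraction(1), Fraction(0))
print(fpr_perfect_1, fpr_perfect_2)  # 0 0
```

Hold precision and false negative rate equal across the groups and the false positive rates are forced apart by the differing base rates. Someone who defines "nondiscriminatory" as equal false positive rates and someone who defines it as equal precision cannot both be satisfied here.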

Unfortunately this means that we will be subjected to a slew of innumerate reporters posting "XXX is discriminatory" articles, regardless of how decisions are made.
