Hacker News
by johnp271 1114 days ago
This artist's several-sentence summary of an ANN, and the way it relates the ANN to prejudice, is fascinating: "The output of an artificial neural network can be roughly defined as a conclusion obtained by generalising a limited set of observations. Surprisingly prejudice can be defined in the same way. This will always be a problem with systems that generalize information. No matter how large and representative a dataset might be there will always be an eccentric outlier that will break the system." On the one hand, this succinctly sums up the challenges we face as AI systems become more and more ubiquitous; on the other hand, it sums up the reality that we non-artificially intelligent humans face in living our lives and dealing with day-to-day encounters.
4 comments

"The output of an artificial neural network can be roughly defined as a conclusion obtained by generalising a limited set of observations. Surprisingly prejudice can be defined in the same way."

Not really surprising. The first thing they teach in data science is that bias is everywhere. One of the first things taught in programming is "garbage in, garbage out" and that computers do exactly what we tell them. Once you start making decisions with biased data, you will start to prejudice some group.

The quest for unbiased systems is a little like the quest for a perpetual motion machine. If we all have biases and these machines learn from the same data we do, using systems we write, how could one expect a different outcome?

This is why you strive to identify the biases and move them from System 1 to System 2 thinking. AI will help humanity operate in a direct cognitive regime rather than a subconscious one.
Not all or even most subconscious influence on decision making is bad. There is plenty we can’t yet quantify and fully understand.
There is a classic theorem from computational learning theory that says that if all hypotheses are equally likely, then no generalization can happen. I.e., bias is necessary for learning.

To respond to some sibling comments: Yup, this is prejudice. I'll try to illustrate the theorem with an example: without prejudice, you can't recognize a leaf in a figure, because the alternate hypotheses (there are an arbitrary number of things in this universe that look like leaves but in fact are not) are all equally likely.
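For anyone who wants to see the "no bias, no generalization" point concretely, here is a minimal sketch. The setup is an assumption chosen for illustration, not the theorem's actual statement: 3-bit inputs, a hypothetical "majority of bits" target, and a hypothetical threshold-rule family standing in for a biased learner. It enumerates every possible labeling of the inputs, keeps the ones that agree with the training data, and checks how they vote on the unseen inputs.

```python
from itertools import product

# All 8 possible 3-bit inputs.
inputs = list(product([0, 1], repeat=3))

def target(x):
    # Hypothetical "true" concept, chosen only for illustration:
    # the label is 1 when at least two bits are set.
    return int(sum(x) >= 2)

train = inputs[:5]    # the inputs the learner gets to see
unseen = inputs[5:]   # the inputs it must generalize to

# Unbiased learner: every truth table over the 8 inputs is a hypothesis,
# and all of them are considered equally likely.
hypotheses = list(product([0, 1], repeat=len(inputs)))
consistent = [
    h for h in hypotheses
    if all(h[inputs.index(x)] == target(x) for x in train)
]

def vote_fraction(x):
    """Fraction of consistent hypotheses that label x as 1."""
    i = inputs.index(x)
    return sum(h[i] for h in consistent) / len(consistent)

unbiased_votes = {x: vote_fraction(x) for x in unseen}

# Biased learner: only consider rules of the form "sum(x) >= k".
biased_consistent = [
    k for k in range(5)
    if all(int(sum(x) >= k) == target(x) for x in train)
]
biased_predictions = {x: int(sum(x) >= biased_consistent[0]) for x in unseen}

print(len(consistent), unbiased_votes)
print(biased_consistent, biased_predictions)
```

With no preference among hypotheses, the surviving ones split evenly on every unseen input, so the training data says nothing about them. Restricting attention to threshold rules (a bias) leaves a single surviving rule, which then commits to a prediction on the unseen inputs. Whether that prediction is right depends entirely on whether the bias matches the world.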

My advisor once told me that machine learning is the study of biases.

"Without the aid of prejudice and custom, I should not be able to find my way across the room." - William Hazlitt

Isn't this more a study of priors and statistics than bias? Bias would be an error between some latent underlying value and an estimate of it.
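To make the statistical sense of "bias" concrete (the expected value of an estimate minus the true value), here is a small worked sketch. The population (a fair coin) and the sample size of 2 are assumptions picked so that every possible sample can be enumerated exactly; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

population = [0, 1]   # a fair coin: each value equally likely
n = 2                 # sample size, kept tiny so we can enumerate all samples

def variance(values, ddof):
    """Sum of squared deviations from the mean, divided by (len - ddof)."""
    mean = Fraction(sum(values), len(values))
    return sum((v - mean) ** 2 for v in values) / (len(values) - ddof)

# The latent "true" value being estimated: the population variance.
true_var = variance(population, 0)

# Every equally likely sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

def expected_estimate(ddof):
    """Exact expectation of the variance estimator over all samples."""
    return sum(variance(s, ddof) for s in samples) / len(samples)

# Bias = E[estimate] - true value.
bias_divide_by_n = expected_estimate(0) - true_var
bias_divide_by_n_minus_1 = expected_estimate(1) - true_var

print(true_var, bias_divide_by_n, bias_divide_by_n_minus_1)
```

Dividing by n systematically underestimates the variance here, while dividing by n - 1 does not. That is bias in the estimator sense, which is a narrower notion than the inductive bias or prior discussed upthread.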
This labels all imperfection in reasoning "prejudice".

Seems like a biased premise.

Definition 1 of prejudice in a lazy google search is "preconceived opinion that is not based on reason or actual experience." Certainly that figures into most reasoning, considering that perfect information is impossible.
That definition seems to cover everything you've learned from a textbook, video, lecture, another person, or in any other indirect way, and which you haven't had an opportunity to think through yet.

Which is... most of the things people know? Including, ironically, this very definition, which I learned about from an HN comment that quoted a Google search result...

I guess it depends on what we define reason and reasoning as. Are the rules a “reason” even if not “reasoning”?
It doesn't label. It doesn't do anything to "all" of anything. It doesn't refer to any "imperfection," and it doesn't address "reasoning."

This is a misrepresentation of the parent comment and the article.

It's interesting. Everybody is always talking about creating unbiased machine learning models, but we're still no closer to cracking the code on unbiased humans.
In the data sense, isn't bias literally just the result of limited or narrow data? So isn't the problem not in how you train models, but simply the fact that it's impossible, or exceedingly difficult, to provide omniscient and universal data?
Bias of a data set is when it doesn't reflect the true underlying distribution of nature.

So a face corpus with only white faces doesn't reflect the diversity of faces one encounters in the world.

With that said, unbiasing data is extremely difficult because the true distribution of things is unknown and sometimes subjective. The images you would encounter as a human from birth to death, growing up in a first-world country, would be very different from those seen by a drone's video camera. Are we really sure that ImageNet should be K% animals and not K/2% animals? And if you train a machine learning algorithm on every possible image with every possible pixel, it will just learn noise.

I'm not biased. It's everyone else that is.