by wyager 3571 days ago
Everyone suggesting that we ought to legislate that machines must be illogical/suboptimal is missing the point.

If machine learning algorithms are unfairly discriminating against some group, then they are making sub-optimal decisions and costing their users money. This is a self-righting problem.

However, a good machine learning algorithm may uncover statistical relationships that people don't like; for example, perhaps some nationalities have higher loan repayment rates. In these cases, the algorithm is not at odds with reality; the angsty humans are. If some people want to force machines to be irrational, they should at least be honest about their motivations and stop pretending it has a thing to do with "fairness".

3 comments

This is a great point. People believe that most groups are basically equal. That is true in the sense that if people were raised in identical environments with equal opportunities, then it probably wouldn't really matter what group they were in, but it is wrong because that isn't the world we live in. Different groups on average experience very different environments. Machine learning doesn't care why the differences between groups arise, but people do. Fundamentally the question is whether we want to base our decisions on how the world is, or on how we want the world to be.
It comes down to a choice between equality of opportunity versus equality of outcome (or some mix of the two). You can't have both - granting equal opportunities will result in unequal outcomes for all kinds of fair and unfair reasons; and ensuring equal outcomes requires unequal opportunities (e.g. quota systems).

For unfair stereotypes it's simple: you just ignore them. But there will be some group differences that are real; it would be a mighty coincidence if so many diverse groups magically happened to be identical in all aspects.

So it's up to society to decide what we will do if it turns out that, other observable factors being equal, race/religion/ethnic background/etc. X actually is 10% more likely to default on a loan.

I keep making essentially the same point about race/gender discrimination in tech. If group X is as effective as group Y but you can get away with paying them 20% less, why would you NOT hire group X? There's no corporation that's so racist or sexist that it'll turn down saving 20% on payroll.

The issue here isn't that machine learning gives wrong answers, it's that our definition of 'fair' is irrational.

>If group X is as effective as group Y but you can get away with paying them 20% less, why would you NOT hire group X?

Hypothetical possibility: members of group X are not perceived as 100% as effective as group Y because of pervasive bias by the employers that assumes their incompetence. They are generally perceived to be 80% as effective as a standard Y member despite actual 100% performance, and paid accordingly. A member of X needs to be 120% as effective as a Y member to be perceived at 100% Y efficiency because of stereotypes coloring their perception and an inability to objectively evaluate their performance.

Some non-hypothetical studies touching on this:

http://www.nber.org/papers/w9873.pdf
http://www.pnas.org/content/109/41/16474.full.pdf+html
http://advance.cornell.edu/documents/ImpactofGender.pdf
http://www.socialjudgments.com/docs/Uhlmann%20and%20Cohen%20...

Ideally, management would just look at the numbers at some level and figure out if there was some measurable pay disparity they could arbitrage and make money off of. I'm sure some companies have. This is a benefit of impersonal, faceless corporate structures; they don't have human qualities like biologically motivated bias in judgement. On the other hand, they don't have qualities like empathy either, so it's not clear if it's preferable or not.
Possible. But there are still potential problems with that, tying in to the article's main issue of potential feedback loops/bias in ML algorithms. Let's assume pay is correlated with perf/job title, and that unintentionally biased managers consistently rate members of group X at 80% of what a member of group Y would earn for identical performance. Let's assume that they're all similarly 80% as likely to be promoted given identical performance. Anyone looking at the data would find that pay for X and Y members is fair given their perf scores/job titles, and that members of X tend to underperform compared to Y. They could suspect bias in perf from that, or they could conclude that members of X are fairly paid but statistically underperforming. An objective evaluation of a biased/unfair dataset doesn't guarantee a fair/objective outcome.
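
To make the scenario above concrete, here is a minimal Python sketch with synthetic data. Everything in it is an assumption for illustration: the 0.8 rating multiplier is the hypothetical from the comment, while the performance distribution, the `PAY_PER_POINT` constant, and the names `simulate` and `group_mean` are invented. Both groups draw true performance from the same distribution, only the recorded rating is biased, and pay is a fixed function of the rating.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

RATING_BIAS = 0.8      # hypothetical: managers rate X at 80% of true performance
PAY_PER_POINT = 500.0  # hypothetical: pay is strictly proportional to rating


def simulate(n=10_000):
    """Return rows of (group, true_perf, rating, pay) for n synthetic employees."""
    rows = []
    for _ in range(n):
        group = random.choice("XY")
        true_perf = random.gauss(100, 10)  # same distribution for both groups
        rating = true_perf * (RATING_BIAS if group == "X" else 1.0)
        rows.append((group, true_perf, rating, PAY_PER_POINT * rating))
    return rows


def group_mean(rows, group, column):
    """Average of one column (1=true_perf, 2=rating, 3=pay) within a group."""
    return mean(row[column] for row in rows if row[0] == group)


rows = simulate()
for g in "XY":
    perf = group_mean(rows, g, 1)
    rating = group_mean(rows, g, 2)
    pay = group_mean(rows, g, 3)
    print(f"group {g}: true perf {perf:.1f}, rating {rating:.1f}, "
          f"pay per rating point {pay / rating:.2f}")
```

In this toy setup pay per rating point is identical across groups, so an audit that conditions on ratings finds nothing wrong, while the recorded ratings for X sit about 20% below Y despite equal true performance. A model trained on the recorded ratings would likely reproduce that gap.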
I think you're right, and obviously making the machine make suboptimal decisions is not a good solution.

However I think a case can be made that certain protected attributes should be censored. Not to prevent the algorithm from making optimal decisions, but to prevent it from overfitting on those attributes. Which, if you think about it, is essentially what discrimination is.

If the algorithm is overfitting, it's costing its users money in the general case. Again, self-righting. We don't need to hide the data we think it's overfitting on; any modern production ML system shouldn't have trouble with extraneous data. You don't see loan bots denying loans to people because they're named "Phil", for example, even though the bots have that information.
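
The extraneous-data claim can be checked in a toy setting. The sketch below is an assumption-laden stand-in, not a production ML system: it uses plain least squares on synthetic data, and the feature names, the true coefficient of 2.0, and the helper `fit_two_features` are all hypothetical. One feature drives the outcome and the other is pure noise, unrelated to both the outcome and the first feature.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

N = 5000
income, extraneous, outcome = [], [], []
for _ in range(N):
    x1 = random.gauss(0, 1)  # hypothetical informative feature (standardized)
    x2 = random.gauss(0, 1)  # hypothetical extraneous feature, unrelated to outcome
    income.append(x1)
    extraneous.append(x2)
    outcome.append(2.0 * x1 + random.gauss(0, 1))  # outcome depends on x1 only


def centered(values):
    """Subtract the mean, which absorbs the intercept term."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def fit_two_features(x1, x2, y):
    """Ordinary least squares with an intercept, via the 2x2 normal equations."""
    x1, x2, y = centered(x1), centered(x2), centered(y)
    s11, s22, s12 = dot(x1, x1), dot(x2, x2), dot(x1, x2)
    s1y, s2y = dot(x1, y), dot(x2, y)
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return b1, b2


w_income, w_extraneous = fit_two_features(income, extraneous, outcome)
print(f"weight on informative feature: {w_income:.3f}")
print(f"weight on extraneous feature:  {w_extraneous:.3f}")
```

With enough data the fitted weight on the noise feature lands close to zero. This only covers a feature that is truly independent of the outcome; it says nothing about features that correlate with biased labels in the training data.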

(Good) ML algorithms don't suffer from human biases; they don't know that there's a categorical difference between e.g. race and shoe size, so we don't need to hide race from these algorithms. That is, of course, unless one's explicit goal is to cripple and pessimize the algorithm for political reasons.