Hacker News
by DuskStar 2608 days ago
One situation I could see leading to this result (Amazon cancelling their resume filtering software with the excuse that it 'skewed male') is that

1. The AI system accurately predicted employee success across both genders

AND

2. The AI system predicted that women would do worse than men

That's politically embarrassing and something that you can't necessarily 'fix' by improving the system. (see: all the 'will this person commit a crime if let out on parole' systems that end up accurately discriminating based on race)

This isn't to say that women are worse engineers than men, or anything of that sort - only that the applicant pool to Amazon was skewed, or women were treated worse in the workplace and thus performed worse, or a dozen other possible causes. (And only in this hypothetical scenario! I have no inside info from Amazon!)

4 comments

Your example is quite possible, particularly at an organization that would be embarrassed by such a result.

Assume that the ability curves of male applicants and female applicants are identical; that the majority of applicants are male; and that Amazon wants to hire more females than would be expected given the portion of applicants that are female.

A natural way of accomplishing this goal is to give extra points to female applicants [0].

Due to selection bias, the ability curve of women within the population of Amazon engineers would skew lower than that of men within the same population.

This is a special case of a more general phenomenon. If you have a signal S that is positively correlated with a desired trait in the general population, and you over-select for S, you will find that S is negatively correlated with that trait within your selected population.

[0]. All proposals I have seen amount to either a good approximation of this or changing the applicant pool. And, by assumption, the latter is excluded.
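
A rough way to check the selection effect described in this comment is a small simulation. This is only a sketch under stated assumptions, not anything Amazon is known to have done: ability is drawn from the same standard normal distribution for both groups, 80% of applicants are male, and hiring is a simple threshold on ability plus a fixed bonus for female applicants. The function name, the threshold, and the bonus size are all made up for illustration.

```python
import random
import statistics


def simulate_hiring(n_applicants=100_000, share_male=0.8,
                    threshold=1.0, bonus=0.5, seed=0):
    """Return the mean true ability of hired applicants, per group.

    All parameters are hypothetical illustration values.
    """
    rng = random.Random(seed)
    hired = {"male": [], "female": []}
    for _ in range(n_applicants):
        group = "male" if rng.random() < share_male else "female"
        # Assumption: identical ability distribution for both groups.
        ability = rng.gauss(0.0, 1.0)
        # Assumption: the "extra points" are a flat bonus on the hiring score.
        score = ability + (bonus if group == "female" else 0.0)
        if score > threshold:
            hired[group].append(ability)
    return {group: statistics.mean(values) for group, values in hired.items()}


means = simulate_hiring()
print(means)
```

Under these assumptions the hired women have a lower mean ability than the hired men, even though the applicant distributions are identical. Setting `bonus=0.0` should make the gap disappear, which suggests the gap comes from the selection rule and not from the applicants.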

In this case, it appears to instead be a matter of journalists focusing on totally the wrong aspect of a story for more drama. Buried deep in the original Reuters piece is this offhand mention:

> Gender bias was not the only issue. Problems with the data that underpinned the models’ judgments meant that unqualified candidates were often recommended for all manner of jobs, the people said. With the technology returning results almost at random, Amazon shut down the project, they said.

Apparently the recommendation system really did create gender bias, neither inherited from real differences nor from replicated human biases. (It looks like an issue with mismatched training data and task.) But that initial bias was found and corrected (2015) more than a year before the project was cancelled (2017) for providing "random" results. I think this is the most extreme case of algorithmic bias I've ever seen, but also the least commonly relevant; Amazon appears to have built a model which contained almost no rules except sexism, and scrapped it for not knowing anything worthwhile.

https://www.reuters.com/article/us-amazon-com-jobs-automatio...

That is certainly another plausible explanation - and a less culture-war infused one, too. Thanks!
This feels like an elephant in the room when it comes to AI bias. We develop an AI that accurately predicts outcomes and discover it is biased, then instead of asking if maybe this means our current system is deeply biased and needs to be changed, we say, "don't use the AI; keep using the people who might or might not be biased but we don't know because we can't measure it in the way an AI can be measured."

If it isn't acceptable to use an AI to create biased outcomes, how is it acceptable to use people to create the same outcomes? AI decision making can be examined and tuned in ways that people cannot.

The problem is that AI and more generally 'algorithms' are or were presented as neutral and unbiased. As such their biased results prop up a biased system.

I don't think people are against using ML and for biased human systems. They are just pointing out the ignorant, naive and lazy deference to computers that often occurs in human systems that share the same bias.

In short I'd think most people who are against biased AI are also against biased human systems for very similar reasons.

Of course, sometimes reality is also biased, and the AI systems are just accurately reflecting reality. And that's an even bigger elephant.
I’m not sure what that even means if we know we can bias outcomes. Pretending there is some kind of natural state that is preferred for the sake of being natural seems odd given humans' propensity to change the world to suit. I also suspect for many that ‘reality’ is really just a dog whistle for their preferred biases. Not to mention the entire issue with deriving an ought from an is.
Suppose you train an AI to predict how good people are at weight lifting, trained from a bunch of seemingly unrelated data (maybe you want to hire bouncers or construction workers). You will find that the model predicts better performance for males. You notice this, identify that men are more likely to go to the gym than women, and modify your data to compensate for this. But when you rerun the model men still show better results. You find some other biases in your data. You find societal biases, like role models for girls not being physically strong. You even take some women and show that with training they outperform average men.

You can modify reality, but our understanding of biology - especially hormones - clearly tells us that the AI was right: men are generally better than women at weight lifting.

I'm not saying that every issue is like that, but it would be foolish to ignore that sometimes reality is biased, sometimes in obvious ways and sometimes more subtly.

What I was getting at is that our important choices are about outcomes and those have nothing really to do with assumptions about reality. For example, "all should be equal before the law" is a statement that is supposed to be true but very obviously isn’t.

Your post is great for the assumptions it encodes. Like what does it mean to be good at weight lifting? And that for some reason being good at weight lifting is a good proxy for being a good bouncer or construction worker?

For an off the cuff example it’s a great way to demonstrate the sort of bias we can naively introduce then defend because it’s just ‘reality’. When really it’s much more complex than identifying a relevant trait and assuming everything else falls out of it.

One major problem.

The parole software was NOT being fed data for "will this person commit another crime". It was being fed data for, "will this person be a suspect for another crime".

The significant difference is that selective enforcement biases the data that it was trained on. Said selective enforcement has multiple causes, including the fact that heavier patrolling in black neighborhoods makes catching crimes more likely.

The size of the selective enforcement bias shows in a number of ways. For example consider drugs. In surveys, the usage of illegal drugs is the same in blacks and whites. And yet 6 times as many blacks are arrested for using illegal drugs as whites.
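
To make the mechanism concrete, here is a toy simulation. It is only a sketch with invented numbers: two hypothetical groups, "A" and "B", use drugs at the same rate, but a user in group B is six times as likely to be arrested for it. The 6x factor is taken from the claim above and fed in as an input, so the output does not confirm it; the usage rate and detection probabilities are placeholders.

```python
import random


def simulate_arrests(n_per_group=200_000, usage_rate=0.10,
                     detect_prob=None, seed=0):
    """Return true usage rate and arrest rate per group.

    All parameters are hypothetical illustration values.
    """
    if detect_prob is None:
        # Assumption: enforcement intensity differs by a factor of six.
        detect_prob = {"A": 0.01, "B": 0.06}
    rng = random.Random(seed)
    result = {}
    for group, p_detect in detect_prob.items():
        users = 0
        arrests = 0
        for _ in range(n_per_group):
            # Assumption: identical underlying usage rate in both groups.
            uses = rng.random() < usage_rate
            if uses:
                users += 1
                # Only users can be arrested, with a group-specific chance.
                if rng.random() < p_detect:
                    arrests += 1
        result[group] = {
            "usage_rate": users / n_per_group,
            "arrest_rate": arrests / n_per_group,
        }
    return result


rates = simulate_arrests()
print(rates)
```

In this toy setup the usage rates come out nearly equal while the arrest rates differ by roughly the enforcement factor, so a model trained on arrest labels would learn the patrol pattern and not the behavior.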

Which represents ground truth better? Arrest records, or survey results?
For this? Probably survey results. Particularly https://nsduhweb.rti.org/respweb/homepage.cfm.