
by fuscy 2808 days ago

The eye opening thing here is not that the AI failed, but why it failed.

At the start the AI is like a baby: it doesn't know anything or have any opinions. By teaching it with a set of data, in this case a set of resumes and their outcomes, it can form an opinion.

The AI becoming biased tells us that the "teacher" was also biased. So Amazon's recruiting process actually seems to be a mess, with the technical skills on the resume amounting to zilch and gender and the aggressiveness of the resume's language being the most important factors (because that's how the human recruiters actually hired people when someone submitted a resume).

The number of women and men in the data set shouldn't matter (algorithms learn that even if there was 1 woman, if she was hired then it will be positive about future woman candidates). What matters is the rejection rate which it learned from the data. The hiring process is inherently biased against women.

Technically one could say that the AI was successful because it emulated the current Amazon hiring status.

12 comments

lalaland1125 2808 days ago

> The number of women and men in the data set shouldn't matter (algorithms learn that even if there was 1 woman, if she was hired then it will be positive about future woman candidates).

This is incorrect. The key thing to keep in mind is that they are not just predicting who is a good candidate, they are also ranking by the certainty of their prediction.

Lower numbers of female candidates could plausibly lead to lower certainty for the prediction model as it would have less data on those people. I've never trained a model on resumes, but I definitely often see this "lower certainty on minorities" thing for models I do train.

The lower certainty would in turn lead to lower rankings for women even without any bias in the data.

Now, I'm not saying that Amazon's data isn't biased. I would not be surprised if it were. I'm just saying we should be careful in understanding what is evidence of bias and what is not.

blt 2808 days ago

It's wrong even if their model doesn't output a certainty (not all classifiers do). Almost all ML algorithms optimize the expected classification error under the training distribution. So if the training data contains 90% men, it's better to classify those men at 100% accuracy and women at 0% accuracy, than it is to classify both with 89.9% accuracy. Any unsophisticated model will do this.

gp: "The number of women and men in the data set shouldn't matter (algorithms learn that even if there was 1 woman, if she was hired then it will be positive about future woman candidates)."

This is false for typical models.
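To put numbers on this, here is a minimal sketch with a made-up population of 900 men and 100 women (the counts are illustrative assumptions, not Amazon's data). It only computes overall accuracy from per-group accuracy, which is the quantity an error-minimizing classifier is rewarded on:

```python
def overall_accuracy(n_men, n_women, acc_men, acc_women):
    """Accuracy over the whole training set, given per-group accuracy."""
    correct = n_men * acc_men + n_women * acc_women
    return correct / (n_men + n_women)


# Hypothetical 90/10 split, matching the example in the comment above.
ignore_women = overall_accuracy(900, 100, acc_men=1.0, acc_women=0.0)
treat_equally = overall_accuracy(900, 100, acc_men=0.899, acc_women=0.899)

print(f"perfect on men, wrong on every woman: {ignore_women:.3f}")
print(f"89.9% on both groups:                 {treat_equally:.3f}")
```

Under plain accuracy the first option scores slightly higher, so nothing in the objective itself pushes the model to get the smaller group right.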

screye 2808 days ago

> The lower certainty would in turn lead to lower rankings for women even without any bias in the data.

This is not true.

Probabilistically speaking, if we are computing P(hiring | gender), lower certainty means there is high variance in the prior over women. However, over a large dataset, the "score" would almost certainly be equal to the mean of the distribution, and be independent of the variance.

In simpler words, if there was a frequency diagram of scores for each gender (most likely bell curves), then only the peak of the bell curve would matter. The flatness / thinness of the curve would be completely irrelevant to the final score. The peak is the mean, and the flatness is the uncertainty. Only the mean matters.

titzer 2808 days ago

There's not enough information about how their ML algorithm works, nor how large their dataset was for any of the above reasoning to be justified. Fwiw, many ranking functions do indeed take certainty into account, penalizing populations with few data points.

ewjordan 2808 days ago

If they were using any sort of neural networks approach with stochastic gradient descent, the network would have to spend some "gradient juice" to cut a divot that recognizes and penalizes women's colleges and the like. It wouldn't do this just because there were fewer women in the batches, rather it would just not assign any weight to those factors.

Unless they presented lots of unqualified resumes of people not in tech as part of the training, which seems like something someone might think reasonable. Then, the model would (correctly) determine that very few people coming from women's colleges are CS majors, and penalize them. However, I'd still expect a well built model to adjust so that if someone was a CS major, it would adjust accordingly and get rid of any default penalty for being at a particular college.

If the whole thing was hand-engineered, then of course all bets are off. It's hard to deal well with unbalanced classes, and as you mentioned, without knowing what their data looks like we can only speculate on what really happened.

But I will say this: this is not a general failure of ML, these sorts of problems can be avoided if you know what you're doing, unless your data is garbage.

lalaland1125 2808 days ago

> It wouldn't do this just because there were fewer women in the batches, rather it would just not assign any weight to those factors.

That's exactly the issue we are talking about here. Women's colleges would have less training data so they would get updated less. For many classes of models (such as neural networks with weight decay or common initialization schemes) this would encourage the model to be more "neutral" about women and assign predictions closer to 0.5 for them. This might not affect the overall accuracy for women (as it might not influence whether or not they go above or below 0.5), but it would cause the predictions for women to be less confident and thus have a lower ranking (closer to the middle of the pack as opposed to the top).
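A toy sketch of that shrinkage effect, under strong assumptions: each group gets a single logit fitted with an L2 penalty (standing in for weight decay), both groups have the same 80% historical hire rate, and only the group size differs. The function name and the numbers are hypothetical, and this says nothing about what Amazon's model actually did:

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_group_probability(hired, total, l2=1.0, steps=50):
    """Fit one logit w by Newton's method on the L2-penalized
    negative log-likelihood and return the predicted hire probability."""
    w = 0.0
    for _ in range(steps):
        p = sigmoid(w)
        grad = total * p - hired + l2 * w   # d/dw of NLL + (l2/2) * w^2
        hess = total * p * (1.0 - p) + l2   # always positive
        w -= grad / hess
    return sigmoid(w)


# Same 80% hire rate in both groups; only the amount of data differs.
p_large = fit_group_probability(hired=800, total=1000)
p_small = fit_group_probability(hired=8, total=10)

print(f"large group prediction: {p_large:.3f}")
print(f"small group prediction: {p_small:.3f}")
```

With identical hire rates, the penalty pulls the small group's prediction noticeably closer to 0.5 than the large group's, which would matter if candidates are ranked by predicted score.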

ewjordan 2808 days ago

I don't think I'm with you. A neural net cannot do this - picking apart male and female tokens requires a signal in the gradients that force the two classes apart. If there's no gradient, then something like weight decay will just zero out the weights for the "gender" feature, even if it's there to begin with. Confidence wouldn't enter in, because the feature is irrelevant to the loss function.

A class imbalance doesn't change that: if there's no gradient to follow, then the class in question will be strictly ignored unless you've somehow forced the model to pay attention to it in the architecture (which is possible, but would take some specific effort).

What I'm suggesting is that it's likely that they did (perhaps accidentally?) let a loss gradient between the classes slip into their data, because they had a whole bunch of female resumes that were from people not in tech. That would explain the difference, whereas at least with NNs, simply having imbalanced classes would not.

bcheung 2808 days ago

How did you control for these things? Wondering what patterns there are that people use to prevent social discrimination.

Seems challenging since much of AI, especially classification, is essentially a discrimination algorithm.

was_boring 2808 days ago

There are a few ways you can tackle this issue: 1) have the same algorithm for each group, but train separately (so in the end you have two different sets of weights); 2) over-sample the group that is under-represented in the data; 3) make the penalty more severe for guessing wrongly on female than on male applicants during training; 4) apply weights to the gender encoding; 5) use more than just resumes as data.

This isn't an insurmountable problem, but it does require more work than just "encode, throw it in and see what happens".
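As a rough illustration of options 2 and 3, one common approach is to give each training example a weight inversely proportional to its group's frequency, so every group contributes the same total weight to the loss. This is a minimal sketch; the helper name and the 9-to-1 split are made up for the example:

```python
from collections import Counter


def balanced_sample_weights(groups):
    """Weight each example by total / (n_groups * group_count), so that
    every group carries the same total weight in the training loss."""
    counts = Counter(groups)
    total = len(groups)
    n_groups = len(counts)
    return [total / (n_groups * counts[g]) for g in groups]


# Hypothetical training set: nine male resumes, one female resume.
groups = ["m"] * 9 + ["f"] * 1
weights = balanced_sample_weights(groups)

weight_m = sum(w for g, w in zip(groups, weights) if g == "m")
weight_f = sum(w for g, w in zip(groups, weights) if g == "f")
print(f"total weight, men: {weight_m:.2f}  women: {weight_f:.2f}")
```

The weights would then be passed to whatever loss or sampler the training code uses; over-sampling the smaller group by the same ratio has a similar effect.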

Amazon only scrapped the original team, but formed a new one in which diversity is a goal for the output.

gsich 2808 days ago

Or: don't include gender in the training data.

kareemsabri 2808 days ago

They didn’t. It was discovered through other signals (mention of membership in “women’s” clubs, etc.).

gsich 2808 days ago

So they did. It should be obvious that if you don't want to include gender, then you have to sanitize gender-related data.
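A minimal sketch of how one might audit for such proxies, assuming gender labels are available for auditing only (not for training). The resumes below are invented toy data and `female_share` is a hypothetical helper:

```python
def female_share(resumes, token=None):
    """Share of resumes labeled 'f', optionally restricted to
    resumes whose text contains the given token."""
    pool = [r for r in resumes if token is None or token in r["text"]]
    return sum(r["gender"] == "f" for r in pool) / len(pool)


# Toy data; the gender field is used to audit features, never as a model input.
resumes = [
    {"gender": "f", "text": "captain, women's chess club; python"},
    {"gender": "f", "text": "java, distributed systems"},
    {"gender": "m", "text": "python, chess club"},
    {"gender": "m", "text": "c++, robotics team"},
    {"gender": "m", "text": "java, rowing"},
    {"gender": "m", "text": "go, distributed systems"},
]

base = female_share(resumes)
with_token = female_share(resumes, token="women's")
print(f"female share overall: {base:.2f}, with token: {with_token:.2f}")
```

A token whose share is far from the base rate leaks gender even when the gender column itself has been dropped, so it is a candidate for sanitizing.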

tomp 2808 days ago

> The lower certainty would in turn lead to lower rankings for women even without any bias in the data.

I don't think that's true. "No bias" means that gender is irrelevant (i.e. its correlation with outcome is 0%). Therefore the system shouldn't even take it into account - it would evaluate both men and women just by other criteria (experience, technical skills, etc), and it would have equal amounts of data for both (because it wouldn't even see them as different).

You need bias to even separate the dataset into distinct categories.

theptip 2808 days ago

> "No bias" means that gender is irrelevant

False. If we're talking about the technical statistical definition, bias means systematic deviation from the underlying truth in the data -- see this article by Chris Stucchio with some images for clarification:

https://jacobitemag.com/2017/08/29/a-i-bias-doesnt-mean-what...

"In statistics, a “bias” is defined as a statistical predictor which makes errors that all have the same direction. A separate term — “variance” — is used to describe errors without any particular direction.

It’s important to distinguish bias (making errors with a common direction) from variance which is simply inaccuracy with no particular direction."
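A small numeric sketch of that distinction, using made-up predictions: one predictor is off by the same amount in the same direction every time, the other is off by the same amount in alternating directions.

```python
from statistics import mean, pvariance

truth = [10, 20, 30, 40]
biased_preds = [12, 22, 32, 42]  # every error is +2
noisy_preds = [8, 22, 28, 42]    # errors are -2, +2, -2, +2


def error_stats(preds, truth):
    """Return (mean error, variance of errors) for a set of predictions."""
    errors = [p - t for p, t in zip(preds, truth)]
    return mean(errors), pvariance(errors)


biased_stats = error_stats(biased_preds, truth)
noisy_stats = error_stats(noisy_preds, truth)
print("same-direction errors:", biased_stats)
print("no-direction errors:  ", noisy_stats)
```

The first predictor has a nonzero mean error and zero variance (statistical bias); the second has zero mean error and nonzero variance.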

tomp 2808 days ago

I think the comments I replied to mean bias as in “sexist bias”.

grandmczeb 2808 days ago

Bias as in racism, sexism, etc, has multiple definitions, some of which are mutually exclusive.

theptip 2807 days ago

Well, it was clear that _you_ think so.

My point was that you should consider the meaning of the word under which the post you're replying to is correct, especially given that the author was claiming specific domain experience.

tomp 2807 days ago

The original was:

> The lower certainty would in turn lead to lower rankings for women even without any bias in the data.

your post said:

> If we're talking about the technical statistical definition, bias means systematic deviation from the underlying truth in the data

So I think my interpretation is correct, even though it's not "the technically statistically correct usage". You were referring to the bias of the algorithm (i.e. the mean divergence from the mean in the data), whereas we were referring to the "hiring bias" evident in the data. In fact, your "bias" was mentioned as "lower rankings for women" - i.e. "the algorithm would have (statistical) bias even without (sexist) bias in the data" and I was replying that I think that's false.

chiefalchemist 2808 days ago

Question: So technically, the AI is not biased against women per se, but against a set of characteristics / properties that are more common among women.

I'm not trying to split hairs (or argue), as much as further clarify the difference between (the common definition of) human bias and that of statistical bias.

zaarn 2807 days ago

Correct.

Computers are very bad at actually discriminating against people; they will pick up a possible bias in a statistical dataset (i.e., <protected class> uses certain sentence structure and is statistically less likely to get or keep the job).

Sometimes computers also pick up on statistical truths that we don't like, e.g., you set up an ML model to classify how likely someone is to pay back their loan and it picks up on poor people and bad neighborhoods, disproportionately affecting people of color or low-income households. In theory there is nothing wrong with the data; after all, these are the people who are least likely to pay back a loan, but our moral framework usually classifies this as bad and discriminatory.

Machine Learning (AI) doesn't have moral frameworks and doesn't know what the truth is. The answers it can give us may not be answers we like or want or should have.

On a side note: human bias is usually not that different, since the brain can be simplified as a Bayesian filter; there are predictions about the present based on past experience, reevaluation of past experience based on current experience, and prediction of future experience based on past and current experience. It's a simplification, but most human bias is based on one of these, either explicitly social (bad experience with certain classes of people) or implicit (tribalism).

theptip 2807 days ago

> the brain can be simplified as a bayesian filter

I agree with everything else in your post, but just wanted to note that while this is true to some extent, the brain is much less rational than a pure Bayesian inference system; there are a lot of baked in heuristics designed to short-circuit the collection of data that would be required to make high-quality Bayesian inferences.

This is why excessive stereotyping and tribalism are a fundamental human trait; a pure Bayesian system wouldn't jump to conclusions as quickly as humans do, nor would it refuse to change its mind from those hastily-formed opinions.

theptip 2807 days ago

> the AI is not biased against women per se

I think I'd make the claim a bit less strongly -- we don't know if there is statistical bias or non-statistical/"gender bias" in the data; both are possible based on what we know.

However exploring the statistical bias possibility, the simple way this could happen is if the data have properties like:

1. For whatever reason, fewer women than men choose to be software engineers

2. For whatever reason, the women that choose to be software engineers are better at it than men

(Note I'm just using hypotheticals here, I'm not making claims about the truth of these, or whether it's gender bias that they are true/false).

Depending on how you've set up your classifier, you could effectively be asking "does this candidate look like software engineers I've already hired"? If so, under the first case, you'd correctly answer "not much". Or you could easily go the other way and "bias" towards women if you fit your model to the top 1% where women are better than men, in our hypothetical dataset.

This would result in "gender bias" in the results, but there's no statistical bias here, since your algorithm is correctly answering the question you asked. It's probably the wrong question though!

Figuring out if/when you're asking the right question is quite difficult, and as the sibling comment rightly pointed out, sometimes (e.g. insurance pricing) the strictly "correct" result (from a business/financial point of view) ends up being considered discriminatory under the moral lens.

This is why we can't just wash our hands of these problems and let a machine do it; until we're comfortable that machines understand our morality, they will do that part wrong.

gambler 2808 days ago

The article didn't specify how they labeled resumes for training. You're assuming that it was based on whether or not the candidate was hired. Nobody with an iota of experience in machine learning would do something like that. (For obvious reasons: you can't tell from your data whether people you did not hire were truly bad.)

A far more reasonable way would be to take resumes of people who were hired and train the model based on their performance. For example, you could rate resumes of people who promptly quit or got fired as less attractive than resumes of people who stayed with the company for a long time. You could also factor in performance reviews.

It is entirely possible that such model would search for people who aren't usually preferred. E.g. if your recruiters are biased against Ph.D.'s, but you have some Ph.D.'s and they're highly productive, the algorithm could pick this up and rate Ph.D. resumes higher.

Now, you still wouldn't know anything about people whom you didn't hire. This means there is some possibility your employees are not representative of general population and your model would be biased because of that.

Let's say your recruiters are biased against Ph.D.'s and so they undergo extra scrutiny. You only hire candidates with a doctoral degree if they are amazing. This means within your company a doctoral degree is a good predictor of success, but in the world at large it could be a bad criterion to use.
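A toy sketch of that selection effect, assuming (purely for illustration) that Ph.D. and non-Ph.D. applicants have the exact same skill distribution and that recruiters simply apply a higher bar to Ph.D.'s. The skill scale and both bars are invented:

```python
from statistics import mean

# Assumption: both applicant pools have identical skill scores 0..99.
skills = list(range(100))


def hired(skills, bar):
    """Keep only applicants whose skill clears the hiring bar."""
    return [s for s in skills if s >= bar]


hired_no_phd = hired(skills, bar=50)
hired_phd = hired(skills, bar=80)  # extra scrutiny: a higher bar

print("applicant mean (both groups):", mean(skills))
print("hired mean, no Ph.D.:", mean(hired_no_phd))
print("hired mean, Ph.D.:   ", mean(hired_phd))
```

Among employees, the degree looks like a strong predictor of skill, even though in the applicant pool it carries no information at all. A model trained only on employees would learn the first pattern.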

noetic_techy 2808 days ago

I'm not a ML guy, but reading this, it almost sounds like the training data needs to be a fictional, idealized set, and not based on real world data that already has bias slants built in. Possibly composites of real world candidates with idealized characteristics and fictional career trajectories. Basically, what-my-company-looks-like vs what-I-want-it-to-look-like. I'm not sure this is even possible.

It's an interesting question. On one hand, a practical person could argue: "Well, this is what my company looks like, and these are the types of people who fit with our culture and make it, so be it. Find me these types of candidates."

VS

"I don't like the way my company culture looks, I would rather it was more diverse. This mono-culture is potentially leaving money on the table from not being diverse enough. I'm going to take my current employees, chart their career path, composite them (maybe), tweak some of the ugly race and gender stats for those who were promoted, and feed this to my hiring algorithm."

ergothus 2808 days ago

> the training data needs to be a fictional, idealized set, and not based on real world data that already has bias slants built in

That'd be great, but in this case (as in most ML cases) the idea is not "follow this known, tedious process" but instead "we have inputs and results but don't know the rules that connect them, can you figure out the rules?"

> this is what my company looks like

In tech hiring, no one wants the team they have...they want more people but without regrets (including regretting the cost)

ewjordan 2808 days ago

> You're assuming that it was based on whether or not the candidate was hired. Nobody with an iota of experience in machine learning would do something like that. (For obvious reasons: you can't tell from your data whether people you did not hire were truly bad.)

It's a fine strategy if all you're trying to do is cost-cut and replace the people that currently make these decisions (without changing the decisions).

I agree that most people with ML experience would want to do better, and could think of ways to do so with the right data, but if all the data that's available is "resume + hire/no-hire", then this might be the best they could do (or at least the limit of their assignment).

b_tterc_p 2808 days ago

A reasonable assumption but, in practice, false. Many companies believe (perhaps correctly) that their hiring system is good. Using hiring outcomes would be a reasonable dependent variable, especially if supply is lower than demand, performance is difficult to measure, or there’s a huge surplus of applications which need to be cut down to a smaller number of human assessed resumes.

jonny_eh 2808 days ago

Men are promoted quicker, and more often, than women.

deegles 2808 days ago

There was a company meeting one year at Amazon when they proudly announced that men and women were paid within 1-2% of each other for the same roles. It completely missed the point which you raise.

I want to see reports of average tenure and time between promotions by gender. I suspect that the reason we don't see those published is that the numbers are damning.

zaarn 2807 days ago

Or possibly no one did a study of sufficient size that passed peer review.

It's also not hard to make the pay gap 1-2% just like it's not hard to make it 25% (both values are valid). Statistics is a fun field. Don't trust statistics you didn't fake yourself.

Amazon could easily cook the numbers to get to 1-2%; I doubt anyone checked whether the process of determining that number is unbiased and fair and accounts for other factors.

gambler 2808 days ago

I didn't write anything about promotions. I mentioned tenure and performance reviews.

If you had a way to accurately predict that some company would systematically downrate you and eventually fire you or force you to quit, would you want to interview there? If you were a recruiter in that company and could accurately predict the same, would it be ethical for you to hire the candidate anyway?

This is not to say that I approve of blindly trusting AI to filter candidates, but the overall issue isn't nearly as simple as many comments here make it out to be.

wetpaws 2808 days ago

Does it correlate with performance?

beat 2808 days ago

And how is performance measured?

Aggressive behavior is considered admirable in men, and deplorable in women. Many women I know have noted comments in their performance reviews about their behavior - various words that can all be distilled to "bitchy".

gspetr 2808 days ago

And then you take your experience, connections and expertise to leave and start your own company where none of this happens.

But is that what we see in real life?

I don't have data or sources at hand, but I'd bet top dollar that the F-M ratio is much more lopsided in men's favor among founders[0] than among employees.

[0] Not using the word CEO, because that can be appointed for somewhat arbitrary reasons.

fizwhiz 2808 days ago

citation needed

fizwhiz 2808 days ago

downvoters, please explain. The statement makes sense when you look at it in tech where there are more men than women. So it may appear that more men are getting promoted compared to their women counterparts. But that doesn't mean men >>> women, it's just statistics at play.

asaph 2808 days ago

> For obvious reasons: you can't tell from your data whether people you did not hire were truly bad.

Many companies are fine with false negatives in their hiring process. Better to pass on a good candidate than hire a bad one.

femidav 2806 days ago

This also means that if you hire unqualified women only because they are women, then your AI will have bias against women.

brown9-2 2808 days ago

This seems to assume that performance evaluation is itself free from bias.

kareemsabri 2808 days ago

This doesn’t seem to be a reasonable conclusion. There is no reason to assume the AI’s assessment methods will mirror those of the recruiters. If Amazon did most of its hiring when programming was a task primarily performed by men, and so Amazon didn’t receive many female applicants, they could be unbiased while still amassing a data set that skewed heavily male. The machine would then just correctly assess that female resumes don’t match, as closely, the resumes of successful past candidates. Perhaps I’m ignorant about AI, but I don’t see why the number of candidates of each gender shouldn’t increase the strength of the signal. “Aggressiveness” in the resume may be correlated but not causal. If the AI was fed the heights of the candidates, it might reject women for being too short, but that would not indicate height is a criterion in Amazon recruiters’ hiring.

danShumway 2808 days ago

This is a subtle point but worth stating -- AI does not mirror or copy human reasoning.

AI is designed to get the same results as a human. How it gets to those results is often very, very different. I'm having trouble finding it, but there was an article a while back trying to do focus tracking between humans and computers for image recognition. What they found was that even when computers were relatively consistent with humans in results, they often focused on different parts of the image and relied on different correlations.

That doesn't mean that Amazon isn't biased. I mean, let's be honest, it probably is; there's no way a company this large is going to be able to perfectly filter or train every employee and on average tech bias trends against women. BUT, the point is that even if Amazon were to completely eliminate bias from every single hiring decision it used in its training data, an AI still might introduce a racial or gendered bias on its own if the data were skewed or had an unseen correlation that researchers didn't intend.

kaitai 2808 days ago

The whole aim of the AI was to make decisions like the recruiters did -- that is explicitly what they were aiming to do. It might be worth reading the article as it addresses your two ideas (the aim of the project and the fact that the training set was indeed heavily male).

kareemsabri 2808 days ago

Hey. I did read the article. It doesn’t support the conclusion OP is drawing. The aim of the AI is to “mechanize the search for talent”. It doesn’t care to, nor have any means to, make decisions “like the recruiters did”. Obviously machines don’t make decisions like humans do. They’re trying to reverse engineer an alternate decision-making process from the previous outcomes.

mbesto 2808 days ago

> The aim of the AI is to “mechanize the search for talent”. It doesn’t care to, nor have any means to, make decisions “like the recruiters did”.

This is why AI is so confusing. All "AI" does is rapidly accelerate human decisions by not involving them, so that speed and consistency are guaranteed. They are not replacements for human decision making, they are replacements for human decision making at scale.

If we can't figure out how to do unbiased interviews at the individual level, then AI will never solve this problem. Anyone that tells you otherwise is selling you snake oil.

chosenbreed 2808 days ago

> If we can't figure out how to do unbiased interviews at the individual level, then AI will never solve this problem. Anyone that tells you otherwise is selling you snake oil.

I wonder to what extent people want to solve it and perhaps more importantly whether or not it can be solved at all...

beat 2808 days ago

This is all happening before the interview, even. The AI, as far as I can see from the article, was just sorting resumes into accept/reject piles, based on the kinds of resumes that led to hire/pass results in the hands of humans.

r00fus 2808 days ago

So the recruiters may or may not have been biased, but if the previous outcomes were (based on the candidate pool) then the AI is sure to have been "taught" that bias.

Unless Amazon is willing to accept a) another pool of data or b) that the data will yield bias and apply a correction, the AI is almost guaranteed to be taught the bias.

kareemsabri 2808 days ago

Yep, I agree a skewed dataset is not good for the task of correcting an unequal distribution and is likely to maintain or even increase it.

erikpukinskis 2808 days ago

Aren't the "previous outcomes" past hiring decisions though?

kareemsabri 2808 days ago

Yes, but you have to know what pool you started with. As an overly simplistic example, if a bank used historical mortgage approval records from primarily German neighbourhoods to train AI, it might become racist against non-Germans despite that it’s just an artifact of the demographics of the time. I think it just shows how not ready for prime time AI is.

BurningFrog 2808 days ago

Control question for if you're making a certain intellectual mistake.

The data set will also have skewed heavily against people named "David". Probably only ~1% of the successful applicants.

Would you also expect the machine to be biased against candidates named David?

astrodust 2808 days ago

What if people named David got hired 10/100 times in the past but people named Denise only got hired 6/100 times?

Hiring practices as expressed in the data get picked up by the machine and applied accordingly. As such, David is predicted to be a better hire than Denise.

This is not about "David" vs. "Denise", but how the machine learning process will aggregate and classify names. David and David-like names will come out on top while obscure names it has no idea how to deal with (0/0 historically) will probably be given no weighting at all.

Sorry "Daud!" Our algorithm says David is better.

kareemsabri 2808 days ago

I would expect the AI isn't fed names as an input, but rather things Amazon wants to weigh like experience, awards and education.

joshuamorton 2808 days ago

This isn't correct: the worry isn't that a single group is small, it's that a single group is large (basically, if one group is large, you can get by ignoring all the smaller groups).

This is most common with binary problems.

fuscy 2808 days ago

I'm going to make a supposition here, but one of the first things I think they did (especially when trying to fix the AI) was to balance and normalize the data so that there would be no skew between the number of records for men and for women in the data set.

If my supposition is correct then the other parameters are at fault here from which gender and language used stick out.

Another supposition I'm going to make is that they even removed the gender from the data set so that AI didn't know it, but cross-referencing still showed "faulty" results due to hidden bias that the AI can pick up, like language used.

kareemsabri 2808 days ago

If they did normalize the data across gender, then you’re correct it may indicate bias on Amazon’s part. But I don’t know about that. The article doesn’t provide enough information. I think it should be obvious, to Amazon as well, that if you want to repair inequality in a trait (gender) you can’t use an unequal dataset to train a machine to select people. I just don’t think it follows that machine bias must mirror human bias.

bilbo0s 2808 days ago

Did you read the article?

(Serious question. Not intended as snark. Genuinely wondering if I'm missing some deeper current in your post?)

kareemsabri 2808 days ago

Twice. It doesn’t support OP’s conclusions.

zby 2808 days ago

"they could be unbiased while still amassing a data set that skewed heavily male" - this sounds like a self contradiction

kareemsabri 2808 days ago

Is the NBA biased against white guys?

zby 2808 days ago

I don't know - is it? What is the difference between bias and inferring information from skewed data?

kareemsabri 2808 days ago

Bias, to me, is the active (perhaps unconscious) discrimination based on a trait. Skew is an unequal distribution of that trait as a result of bias in favor of other traits, historical circumstances, or anything other than discrimination.

The NBA wants good basketball players. If they happen to be white, I imagine they'd draft them with equal enthusiasm as any other player. So no, it isn't.

roenxi 2808 days ago

Do you have some information not present in the article? There seem to be some assumptions on the training process in your comment that are not sourced in the article.

I'll don my flak jacket for this one, but based on population statistics I believe a statistically significant number of women have children. A plausible hypothesis is that a typical female candidate is at a 9 month disadvantage against male employees and that that is a statistically significant effect detected by this Amazon tool.

Now, the article says that the results of the tool were 'nearly random', so that probably wasn't the issue. But just because the result of a machine learning process is biased does not indicate that the teacher is biased. It indicates that the data is biased, and bias always has a chance to be linked to a real-world phenomenon.

brown9-2 2808 days ago

Does Amazon give 9 months of parental leave, or are you saying women employees are disadvantaged for their entire pregnancy?

roenxi 2808 days ago

Ah. Sorry, silly me. A quick search suggests 20 weeks, so ~4.5 months.

Obviously I don't have much specific insight, so maybe there is a culture where they don't use leave entitlements. But if there are indicators that identify a sub-population taking a potentially 20 week contiguous break it is entirely plausible that it would turn up as a statistically significant effect in an objective performance measure. All else being equal, then a machine learning model could pick up on that.

The point isn't that it is the be-all and end-all, just that the model might be picking up on something real. There are actual differences in the physical world.

dheera 2808 days ago

The term "AI" is over-hyped. What we have now is advanced pattern recognition, not intelligence.

Pattern recognition will learn any biases in your training data. An intelligent enough* being does much more than pattern recognition -- intelligent beings have concepts of ethics, social responsibility, value systems, dreams, ideals, and are able to know what to look for and what to ignore in the process of learning.

A dumb pattern recognition algorithm aims to maximize its correctness. Gradient descent does exactly that. It wants to be correct as much of the time as possible. An intelligent enough being, on the other hand, has at least an idea of de-prioritizing mathematical correctness and putting ethics first.
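To make the gradient descent point concrete, here is a minimal sketch in plain Python. It fits a logistic regression by gradient descent to a made-up table of past hiring decisions. The counts, the `skill` and `group` features, and every function name are assumptions for illustration only, not anything from Amazon's tool. The optimizer only tries to match the historical labels, so it ends up with a negative weight on `group` simply because the labels were skewed that way.

```python
import math

# Made-up historical decisions: (skill, group, hired) -> number of resumes.
# These counts are illustrative assumptions, not real hiring data.
HISTORY = {
    (1, 0, 1): 8, (1, 0, 0): 2,
    (1, 1, 1): 4, (1, 1, 0): 6,
    (0, 0, 1): 2, (0, 0, 0): 8,
    (0, 1, 1): 1, (0, 1, 0): 9,
}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, skill, group):
    """Predicted probability of 'hired' for one resume."""
    bias, w_skill, w_group = weights
    return sigmoid(bias + w_skill * skill + w_group * group)


def train(history, steps=5000, lr=0.5):
    """Minimize the average log loss with plain gradient descent."""
    weights = [0.0, 0.0, 0.0]
    total = sum(history.values())
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for (skill, group, hired), count in history.items():
            error = predict(weights, skill, group) - hired
            for i, x in enumerate((1.0, skill, group)):
                grad[i] += count * error * x / total
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


weights = train(HISTORY)
bias, w_skill, w_group = weights
print(f"w_skill={w_skill:.2f}  w_group={w_group:.2f}")
print("skilled, group 0:", round(predict(weights, 1, 0), 2))
print("skilled, group 1:", round(predict(weights, 1, 1), 2))
```

Two equally skilled resumes get different scores purely because of `group`. Nothing in the loss tells the model that this feature ought to be ignored.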

Deep learning in its current state is emphatically NOT what I would call "intelligence" in that respect.

Google had a big media blooper when their algorithm mistakenly recognized a black person as a gorilla [0]. The fundamental problem here is that state-of-the-art machine learning is not intelligent enough. It sees dark-colored pixels with a face and goes "oh, gorilla". Nothing else. The very fact that people were offended by that is a sign that people are truly intelligent. The fact that the algorithm didn't even know it was offending people is a sign that the algorithm is stupid. Emotions, the ability to be offended, and the ability to understand what offends others, are all products of true intelligence.

If you used today's state-of-the-art machine learning, fed it real data from today's world, and asked it to classify them into [good people, criminals, terrorists], you would end up with an algorithm that labels all black people as criminals and all people with black hair and beards as terrorists. The algorithm might even be the most mathematically correct model. The very fact that you (I sincerely hope) cringe at the above is a sign that YOU are intelligent and this algorithm is stupid.

*People are overall intelligent, and some people behave more intelligently than others. There are members of society that do unintelligent things, like stereotyping, over-generalization, and prejudice, and others who don't.

[0] https://www.theverge.com/2018/1/12/16882408/google-racist-go...

GuB-42 2808 days ago

We are pattern recognition machines. If you consider pattern matching unintelligent, then machines are more intelligent than we are since they rely more on logic than pattern matching.

For the black man = gorilla problem, an untaught human, a small child for instance, can easily make the same mistake. Especially if he has seen few black people. And well educated adults can also make the mistake initially, even if they hate to admit it.

However, in the last case, a second pattern recognition step happens, one that checks the result of the image classifier against social rules. And it turns out that mixing up black men and gorillas is a clear anti-pattern, so any such match that isn't certain is treated as incorrect.

Unlike us, computer image classifiers typically aren't taught social rules, so like a small child, they will say things without a filter. That will probably change in the future for public-facing AIs.
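As a rough sketch of what such a second step could look like, here is a tiny rule layer that sits on top of a classifier's output. The label set, the threshold, and the function name are all hypothetical, and it is hand-written rules rather than anything learned: a label on the sensitive list is dropped unless the classifier is close to certain.

```python
# Hypothetical list of labels that must never be emitted on a shaky guess.
SENSITIVE_LABELS = {"gorilla"}


def apply_social_rules(label, confidence, threshold=0.99):
    """Return the label, or None when a sensitive label is not near-certain."""
    if label in SENSITIVE_LABELS and confidence < threshold:
        return None
    return label


print(apply_social_rules("cat", 0.60))       # ordinary label passes through
print(apply_social_rules("gorilla", 0.90))   # sensitive and uncertain: suppressed
```

A learned version would replace the hand-written set with a model trained on which mistakes people find offensive, but the two-stage shape stays the same.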

Not stereotyping is not a mark of intelligence, it is a mark of a certain type of education. And I don't see why it couldn't be done with the usual machine learning techniques.

dheera 2808 days ago

> social rules

I claim it isn't just social rules -- part of that is empathy, which is a manifestation of intelligence that I think is beyond pattern matching.

If a white person were mislabeled as a cat, it would be a cute funny mistake. Labeling people as dogs, not so much. Gorillas, even worse. That is despite gorillas being more intelligent and empathetic than cats. Oh, and labeling a white bodybuilder celebrity boxing champion as a gorilla may actually be okay. The same guy as a dog, no. It makes no sense to a logic-based algorithm. But humans "get it".

A human gets it because they could imagine the mistake happening against them, with absolutely zero prior training data. You don't need to have seen 500 examples of people being called gorillas, cats, dogs, turtles and whatever else.

If you want to say that a hundred pattern recognition algorithms working together in a delicate way might manifest intelligence, I think that is possible. But the point is one task-specific lowly pattern recognition algorithm, which is today's state of the art, is pretty stupid.

ArchTypical 2807 days ago

> We are pattern recognition machines.

That's just one function. That's not the entirety of what the brain (and body) does.

> If you consider pattern matching unintelligent,

What do you think pattern matching IS? Round ball, round hole does not require intelligence. It requires physics. The convoluted Rube Goldberg meat machine that we use to do it doesn't change what it is. Making choices of will, and approximations, are more signs of intelligence, imo.

platz 2808 days ago

"a worldview built on the importance of causation is being challenged by a preponderance of correlations. The possession of knowledge, which once meant an understanding of the past, is coming to mean an ability to predict the future." - Big Data (Schonberger & Cukier)

so, knowledge now is allegedly possession of the future, rather than possession of the past.

This is because the future and past are structurally the same thing in these models. Each could be missing, but re-creatable links.

Also, conflicting correlations can be shown all the time. If almost any correlation can be shown to be real, what's true? How do we deal with conflicting correlations?

IshKebab 2808 days ago

They didn't scrap it because of this gender problem. That wasn't why it failed. They scrapped it because it didn't work anyway.

Note the title is "Amazon scraps secret AI recruiting tool that showed bias against women" not "Amazon scraps secret AI recruiting tool because it showed bias against women". But I guess the real title is less clickbaity - "Amazon scraps secret AI recruiting tool because it didn't work".

stcredzero 2808 days ago

The same AI should be applied to hiring nurses and various other fields which show population skews in gender, as well as fields which are not skewed. I'd be curious as to the outcome.

Yoyoyou 2807 days ago

It failed because rationally interpreting gender data leads to politically incorrect conclusions.

hkai 2808 days ago

How did you come to the conclusion that gender was the most important, rather than skills or aggressiveness?

kaitai 2808 days ago

I don't think that's what the parent was claiming; the parent says "gender and aggressiveness" were most important, and that the skills listed on the resume provided such an unclear signal for actual hires that the AI did not pick them up.

Consultant32452 2808 days ago

Without regard to this particular issue, you also have to concern yourself with the bias of the person determining if the AI has a bias.

monochromatic 2808 days ago

> The AI becoming biased tells that the "teacher" was biased also.

That doesn’t follow.

s73v3r_ 2808 days ago

Someone had to decide on the training material. Note that saying that they had bias does not mean that they acted with malicious intent; most likely they didn't. That doesn't change the outcome, however.

louwrentius 2808 days ago

Thanks for spelling this out, I think this is exactly how to look at this.