Seeing a lot of "Well but what if it works" here, and I'll weigh in as someone who does AI - including CV-related applications involving convolutional neural networks - professionally and has participated in academic research in that field as well: I would estimate that roughly 80% of all "AI-based inference" that's currently being sold commercially is snake oil. Whether it's ad-tech, malware detection, facial recognition, whatever. There's definitely some promise to these systems, and there are some applications that have already started to bear out, but for the most part the industry is rife with bad data, bad data practices, lack of rigor in the way they scope and define their targets, and rushing products to market well before their claims actually meet even the criteria outlined for the project, let alone any standard of scientific evidence in this field or others. The fact that governments and companies that can make serious decisions with serious consequences are using these systems as-is is a serious problem that is already apparent in several places where they've been deployed.
For systems that claim to make any kind of psychological inference, I doubt there's a single one whose claims you should believe. Nearly half of human-performed psychology studies in the last 70 years have failed to replicate, including ones whose results have become "common knowledge" (the field as a whole fared quite badly in what's called the "replication crisis"), and many of the very best supported of these supposed AI-psychology insights are, if you look into them, built on assumptions drawn from those results. Most of them fail to do even that, and rush to market a rigged result that doesn't generalize, because they can get paid a fortune by customers who buy the hype.
100% of all AI hiring systems, AI proctors, AI recidivism predictors, AI drug-seeking classifiers, and anything that, like this article refers to, purports to infer personality traits from faces are both bullshit and dangerous. Maybe there could exist a reality where this wasn't true, but there is absolutely no solid reason right now to believe we live in that reality.
Totally agreed on the snake oil. But you can't deny there are definitely strong signals. For example, NNs can detect the sexual orientation of males with 91% accuracy, from only five images.
Also, not saying 91% is good enough for any government or business to commit to.
Have you read the actual paper? That figure is at best inflated, and their methodology is riddled with confounds, including but not limited to the role of social networking profile pictures as dating signals, intentional signaling such as grooming and makeup, and factors like the socio-economic status and locale of the people being classified, which can be more predictive of declared sexual orientation than anything resulting from physiological features.
Oh right, it's probably worthwhile to note that since there are considerable reasons in many parts of the world to hide one's sexual orientation, and this study's design only conditions on reported sexual orientation on social media, the results are intrinsically skewed just by being from a population of out gay people.
Also, bear in mind that if we take Wikipedia's reported rate of homosexuality in the general human population, for which 9% would be... pretty generous (the "Demographics of sexual orientation" article lists several statistics and I can't find a world aggregate, but e.g. San Francisco is 15%), a null classifier that always guesses "straight" would be just as accurate. If the true rate is lower, the null classifier is even more accurate.
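To put numbers on the null-classifier point, here is a minimal sketch. It assumes the 91% figure is read as plain accuracy and that the minority class is rarer than 50%; the function name and the list of base rates are illustrative only.

```python
def null_classifier_accuracy(base_rate: float) -> float:
    """Accuracy of a classifier that always predicts the majority class.

    Assumes base_rate (the share of the minority class) is below 0.5,
    so always guessing the majority is right for everyone else.
    """
    return 1.0 - base_rate


# Illustrative base rates: the 15% and 9% come from the comment above,
# the lower ones are hypothetical.
for rate in (0.15, 0.09, 0.07, 0.03):
    acc = null_classifier_accuracy(rate)
    print(f"base rate {rate:.0%}: always-majority accuracy = {acc:.0%}")
```

The lower the base rate, the better "always guess the majority" looks, which is why raw accuracy says little on an imbalanced population.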
> However, among the 100 of 585 individuals with the highest probability of being gay according to the classifier, 47 were gay. In other words, the classifier provided for a nearly seven-fold improvement in precision over a random draw (47/7 = 6.71). The precision could be further increased by narrowing the targeted subsample. Among 30 males with the highest probability of being gay, 23 were gay, an eleven-fold improvement in precision over a random draw (23/2.1 = 11). Finally, among the top 10 individuals with the highest probability of being gay, 9 were indeed gay: a thirteen-fold improvement in precision over a random draw.
Yes, and this further demonstrates how ridiculous the reporting on this result was. Their sample population was tiny and skewed, and this paragraph is a great example of how you can make your results look better to lay readers by reducing N (which in statistical terms should reduce your level of confidence in the result you got, because the likelihood of spurious accuracy from random factors increases) and choosing whatever metric sounds best when you do that (here, they do so by only talking about precision, i.e. avoiding false positives, with no mention of the false negatives).
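A rough sketch of the small-N point, using only the counts from the quoted paragraph. The 7% base rate is inferred from the quote's own "47/7" and "23/2.1" arithmetic; the Wilson score interval is my choice of uncertainty estimate, not something the paper reports.

```python
import math

BASE_RATE = 0.07  # inferred from the quote: 7 expected in 100, 2.1 in 30


def lift(hits: int, n: int, base_rate: float = BASE_RATE) -> float:
    """Precision in the top-n group divided by the random-draw rate."""
    return (hits / n) / base_rate


def wilson_interval(hits: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a proportion hits/n."""
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# (hits, group size) pairs taken from the quoted passage
for hits, n in [(47, 100), (23, 30), (9, 10)]:
    low, high = wilson_interval(hits, n)
    print(f"top {n:>3}: precision {hits / n:.2f}, lift {lift(hits, n):.1f}x, "
          f"95% interval {low:.2f} to {high:.2f}")
```

The headline multiplier grows as the group shrinks, but so does the interval around the precision it is built on. None of this says anything about recall, which would need the number of gay men outside each top-n group.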
But I do have to give you some credit for providing a case study for why scientific literacy is super hard, and doubly so in a context where researchers are strongly incentivized to try to make their results sound as convincingly meaningful as possible
How do you explain the accuracy going up with more samples then?
Also, it wasn't all 90/10. It was all pairwise:
> Among men, the classification accuracy equaled AUC = .81 when provided with one image per person. This means that in 81% of randomly selected pairs—composed of one gay and one heterosexual man—gay men were correctly ranked as more likely to be gay. The accuracy grew significantly with the number of images available per person, reaching 91% for five images. The accuracy was somewhat lower for women, ranging from 71% (one image) to 83% (five images per person).
However, there's still a valid chance that the classifier relates more to any attributions provided in profile images than to facial features. (E.g., we're pretty good at inferring social status or role from medieval portraits based on such attributions without knowing anything about phrenology or physiognomy.)
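For anyone unsure what the pairwise reading of AUC means in practice, here is a minimal sketch on synthetic scores. The score distributions, sample sizes and seed are all made up for illustration and have nothing to do with the paper's data.

```python
import random


def pairwise_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Ties count as half a win, which matches the usual AUC definition.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


rng = random.Random(0)
pos = [rng.gauss(1.0, 1.0) for _ in range(200)]  # hypothetical positives
neg = [rng.gauss(0.0, 1.0) for _ in range(200)]  # hypothetical negatives

auc_balanced = pairwise_auc(pos, neg)    # 50/50 class mix
auc_skewed = pairwise_auc(pos, neg * 9)  # same negatives repeated: 10/90 mix

print(f"AUC, balanced: {auc_balanced:.3f}")
print(f"AUC, skewed:   {auc_skewed:.3f}")
```

Repeating the negatives changes the class balance but not the AUC, so an AUC of .91 is not directly comparable to the accuracy of an always-"straight" guesser on a 9% population.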
Prejudice is superstition, pareidolia, and statistical undue weight fallacies like "one black person mugged me once, therefore from a sample size of one I am concluding that blacks are criminals."
> The reanimation of the pseudosciences of physiognomy and phrenology at scale through computer vision and machine learning is a matter of urgent concern.
When a computer can accurately predict (90%~) sexuality, criminal proclivity etc through facial features then what exactly is 'pseudoscience' about it?
Sure it can and will be abused but that doesn't mean to ignore it or label it as 'pseudo' simply because it hurts your fee-fees.
Because it would be the equivalent of a natural built-in evil bit and an extremely bizarre thing - for the same reason that the evil bit is unlikely to be used.
Any system that claims to work on that sort of input is almost certainly picking up socio-economic status of different races, or something similar, with no causal predictive power.
You are saying that there is no genetic component to personality? That's dumb. Are you saying there is no genetic component to facial features? Also dumb. Are you saying that there is no crossover whatsoever between the genetics that govern facial features and personality? Also dumb. There will be crossover above 0; it would be extraordinary if there were 0 crossover. So there is likely some small correlation between facial features or skull shape and personality. How big is the effect, I don't know. The problem here is that you are judging the value of a person based upon their personality, not that their personality might be bound up in some way with their genetic makeup.
You've missed an angle - the causal link between genetics and personality is completely overwhelmed by the non-causal correlation between genetics and social status.
These models aren't going to pick up the correlation between facial structure and personality, they are going to pick up which families are high status and which are low, then provide the same pseudoscientific justifications for discriminating that people have been deploying since the dawn of pseudoscience.
Basically, these models are going to mislead people into thinking that a non-causal correlation is causal.
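A toy simulation of that argument, under loudly stated assumptions: a hidden binary "status" variable shifts both a measured feature and the recorded label, and the feature has no effect of its own on the label. All variable names, probabilities and the seed are hypothetical.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


rng = random.Random(42)
rows = []
for _ in range(20000):
    status = rng.random() < 0.5                        # hidden confounder
    feature = (1.0 if status else 0.0) + rng.gauss(0.0, 1.0)
    # The label depends on status only, never on the feature itself.
    label = 1 if rng.random() < (0.4 if status else 0.1) else 0
    rows.append((status, feature, label))

overall = pearson([f for _, f, _ in rows], [l for _, _, l in rows])
within = {
    s: pearson([f for st, f, _ in rows if st == s],
               [l for st, _, l in rows if st == s])
    for s in (False, True)
}

print(f"feature vs label, everyone:        {overall:+.3f}")
print(f"feature vs label, status == False: {within[False]:+.3f}")
print(f"feature vs label, status == True:  {within[True]:+.3f}")
```

The feature "predicts" the label across the whole population, yet the correlation essentially vanishes once you look inside each status group. A model trained on the pooled data would happily learn the status proxy.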
> Any system that claims to work on that sort of input is almost certainly picking up socio-economic status of different races, or something similar, with no causal predictive power.
I wonder which will have more predictive power, the version where you let the AI do its thing or the version where you intervene to correct for things that are almost certainly wrong according to you.
An AI doesn't do "its thing", it learns with the bias the researcher encoded in the model, and most importantly in this case, with the massive bias of the datasets.
Correcting is just steering a bias from one way to another.
I also can't disprove the existence of unicorns, but I can cite a preponderance of the evidence.
Why is the shape of your face different than the lumps on your head? Even if you find a correlation in the data why would there be a causative relation? If I'm innocent one day but steal a loaf of bread do you expect the shape of my face to change? The idea makes no sense.
No, it won't change, but certain facial metrics may indicate proclivity.
This is statistics, so an n = 1 doesn't really help your argument.
I agree though, that physiognomic "accuracy" based on self-assessments is of little value, like most self-assessments, and not very different from tarot readings or online IQ or MBTI tests.
Now when there are external assessments, these types of correlations are to be handled carefully in social sciences, because they can be self-fulfilling prophecies (people don't trust you because you look untrustworthy, so you end up behaving the way that gets you treated that way anyway) or straight up spurious, so the independent variable(s), if any, can be extremely non-evident: nutrition, environmental, cultural... This is, again, not unlike intelligence tests.
The best hard data we have is on a less delicate subject: aggression in hockey, where certain facial features correlate with quantifiable aggressive behaviors [0].
Because the shapes of portions of our bodies do not betray our moral character. It is nonsense, and debating this issue is so tiresome. I've worked in facial recognition for quite some time, and thank gawd nobody where I've worked had over-reaching opinions of our software's capabilities. For example, the "emotion recognition AI" fraudulently being marketed - we howled in laughter when those frauds appeared. However, while interviewing at other FR/ML companies, I meet a horror of over-reaching attitudes. I guess someone can work in trained algorithms and yet carry a head full of conspiracy-theory quality logical connections. That must be the case, because physical shape cannot dictate moral character, and debating the issue is Kafkaesque.
Two separate things: first, you're correct, there are lots of attributes that can be accurately inferred from appearance (for varying accuracies). I don't see a point in pretending otherwise.
Second, unless demonstrated otherwise, most determinations you could make, e.g. wealth, are not causal, they are effectively a computerized stereotype that looks for some common features the majority of each class share. To me this means facial features are not a suitable basis for a decision anywhere you wouldn't feel comfortable stereotyping.
Put another way, you can easily propose rules that are correct on average but horribly unfair to those that don't conform to the rule. This is true for ML as it is for any other rules. The only ML specific thing here is maybe the basis for the predictions gets obscured to some as something deeper than it is.
Research, also of correlations (as a starting point), is to some of us "in general definitely a good thing". But.
Your use of 'accurately' probably does not consider actual sensible practices in the assessment of data, where you assess the occurrences of false positives and false negatives in more revealing considerations. One of the first online articles found through a quick search seems to be very good already as an introduction: https://towardsdatascience.com/accuracy-precision-recall-or-... (Koo Ping Shung, Accuracy, Precision, Recall or F1?, 2018).
Politically, there is a problem in fairly dealing with the matter of inclinations, especially considering that guilt is after actions, not inclinations, or considering that it amounts to "prejudice".
The use of 'pseudoscience' in the article was more political than theoretical - imprecise but left to the reader's margins of "getting the idea". It meant that "we have been there and the actual scientific results were poor (e.g. we could not predict local brain function under that bulge that may have meant inclination i)".
Science is much more complex than the simple correlation you seem to be supposing. Science is about understanding phenomena with objective grounds and methods (understanding is then corroborated with predictions, but predictions are not understanding). In your example you are limiting the matter to observations: they are the first step in science, not the last. (A statement like 'people with quality q tend towards inclination i' would be an observation, not a law.)
You make a good point of distinguishing between a scientific model and a predictive oracle. I think the political use of pseudoscience isn't helpful though. There could be a lot of understanding to be gained from these systems.
> I think the political use of pseudoscience isn't helpful though. There could be a lot of understanding to be gained from these systems.
If you meant using the label of 'pseudoscience' to undermine research (in the broadest terms), or to promote blind faiths (e.g. scientism) or "arbitrary requirements for social membership", of course it is detrimental (though the economic/financial matter is more complex). But in the specific context, the attribution of pseudoscience is (though often applied with little care and an improper naïvety) a substantially legitimate warning: "do not encourage shallow beliefs amounting to prejudices".
Many of us believe that similar research may reveal interesting correlations which may then trigger good insights. But there has been a trend (especially in cultures that have shown very little appreciation of subtlety as an ideal) that seems to encourage a return to the archaic, ignorant use of stereotypes. Apply that to law enforcement, and - sorry, just inventing a sufficiently acceptable example, after today's article about "Irish Baileys" - the idea subtly or less subtly emerges to arrest sober Irishmen just because.
It is pseudo because it does none of the things you said with none of the accuracy you propose.
What it does well, is tell you whether the specific picture of a person you feed it looks roughly similar to other pictures of other people that belong to a certain category.
All of its predictive power comes from the fact that the datasets it is trained on are completely imbalanced and that society has inherent biases, so it just picks up on that and magnifies it.
I can guarantee you that a picture of a white male CEO in a suit and a picture of a black young adult in everyday clothing will score extremely differently on the model no matter what their personal criminal proclivity is.
Humans (cops) do the same thing: they are used to a certain population being more at risk of criminal activities and thus they will control anyone from that population much more (e.g. stop and frisk in NYC).
This is illegal in most places. We are doing the same thing all over again, except now we can slap a black-box "AI" brand on it, call it science and legalize it again.
Yeah, that -might- be true, but what if people just lie to the computer? Also, love the idea that computers figured out the knack in reading head bumps the humans just couldn't crack.
> love the idea that computers figured out the knack in reading head bumps the humans just couldn't crack
So, natural selection has evolved a criminal gene and a linked head bumps gene, but hasn't given us humans the ability to detect that? That sure would have been useful.
Nature "knew" and patiently waited hundreds of thousands of years, keeping that useless gene alive, knowing that eventually computer vision would appear and finally allow us to detect it?
> When a computer can accurately predict (90%~) sexuality, criminal proclivity etc through facial features then what exactly is 'pseudoscience' about it?
90% is a fairly poor success rate for a predictive model. If you're in the 10% of those falsely accused of being a future murderous pedophile based solely on your facial bone structure, then imprisoned or institutionalized for that, I'd think your conclusions on this topic may change.
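A back-of-the-envelope sketch of why 90% is worse than it sounds when the trait is rare. It assumes "90%" means 90% sensitivity and 90% specificity, and it assumes a 1% base rate; both are my illustrative numbers, not figures from any study.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              base_rate: float) -> float:
    """Share of flagged people who actually have the trait (Bayes' rule)."""
    true_pos = sensitivity * base_rate
    false_pos = (1.0 - specificity) * (1.0 - base_rate)
    return true_pos / (true_pos + false_pos)


# Hypothetical setup: 90% sensitivity, 90% specificity, 1% base rate.
ppv = positive_predictive_value(0.90, 0.90, 0.01)

population = 100_000
flagged_right = population * 0.01 * 0.90   # true positives
flagged_wrong = population * 0.99 * 0.10   # false positives

print(f"correctly flagged: {flagged_right:,.0f}")
print(f"wrongly flagged:   {flagged_wrong:,.0f}")
print(f"chance a flagged person actually has the trait: {ppv:.1%}")
```

Under these assumptions roughly eleven innocent people are flagged for every correct hit, which is the opposite of the Blackstone ratio.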
Remember the Blackstone ratio (1).
Also, physiognomy was a key attribute of Nazi eugenics goals (2) - that's the rabbit hole this work leads down.
It's not about protecting feelings, it's about remembering history and learning from mistakes to protect liberty.
Lastly, regarding the goal to predict a person's "sexuality" by any method, I would posit that the motives are most likely strongly against, rather than for, liberty for all. What else would that information be used for, other than to oppress?
Surely you can discuss the matter yourself, given your use of the expression 'best way' (which would have required much more extensive defence on your side) and the number of competing factors which come to mind, such as education, societal development (environment/example, cohesion, organization), mental health, economic structure, opportunities, sanctions, law enforcement... Is not there a discipline of criminology?
My question was rhetorically stylized. I meant the best way between two opposing alternatives, i.e., between suppressing and embracing these technologies as tools in any of those fields you mention.
Ok then (now it's much clearer). Now, - «suppressing» - the article in some parts surely may sound emotionally censorial («should be anathema»), but, - «embracing» - is the real-world context one of getting elements for further study with all caveats and common sense in place, or one of hyped glamorous reliance? Is it for the criminologist or for an agent? If HR uses an AI screening to hire you (first example in the article), you will think you dodged a bullet not working there, but if it is the Country that makes decisions relying on it (maybe "false positives are expendable when dealing with big numbers" - which also, per se, would be the opposite of justice), well, there are fewer than 200 around, and many of them will not let you in just because you want to.
So, new elements to study: all very interesting. But: "Sir, I see you have the traits of those who smear themselves with bananas for pleasure" - there are a number of issues with it. You know, constructs like, e.g. from Armando Iannucci: «...the sort of racist brush that could only be wielded by a Scottish-Italian» (about himself, in Charm Offensive).
That's a devious analogy. Racism is not the same as acknowledging genetic differences. If people with darker skin are more likely to suffer from vitamin D deficiency, wouldn't we want to be informed about it? Conversely, should pale people simply ignore being burned by the sun when visiting tropical places?
Yes, but go back to the topic, to carry on with the idea of "physiognomy to defeat injustice". Blood type C correlates with increased shoplifting: then what? You study the matter further (good)? Or you approach Roger Slackermeyer on the strength of a 90% confidence of something (well...)?
> Blood type C correlates with increased shoplifting: then what?
Then you take appropriate measures, or you don't. Compare it to regular diseases. If you have a history of cancer in your family, you would probably want to go on regular screenings. Or if your family has a history of suicide, perhaps you want to pay special attention when your kids are feeling depressed.
Telling Alice and Bob apart visually because of their individual facial features, and possibly using that to uniquely identify them, is one thing.
Being convinced you can deduce Bob's character traits e.g. determine that Bob must have criminal tendencies based on the shape of his nose, is something very different.
This paper is about the latter. About people doing CS research becoming convinced the latter is possible, because their statistics dowsing rod has found a correlation in their data set. And the very bad consequences that can follow from this.
You might want to at least read the abstract of the paper.