by ipsa 2434 days ago
The fruit machine was reincarnated for pedosexuals: a device attached to their genitals measures if they get sexual arousal from pictures of children. Those that do are not deemed ready for rehabilitation.

Where most people yell scam or digital phrenology, I have a somewhat contrarian view: These systems do work. It is possible to tell, better than random guessing, if someone is gay or has a violent disposition, with just a single picture. Prisons for violent crimes see way more inmates that are bald, bearded, acned, square-jawed (signs of high testosterone). Replicated studies have shown that the profile pictures of gay men are significantly different from those of straight men, from subtle effects, such as more attention to grooming, to more physically noticeable ones, like the shape of the jaw being more rounded.

I have no reason to disbelieve that an automated system could check for tell-tale signs that someone is hiding something: needing a lot of time to answer basic questions, using their lead hand to cover their chin, looking not in the direction commonly associated with recall, but that of imagination, trembling voice, anxious eye twitches, etcetera.

This is what flawed human border guards are already doing. Israel has the most advanced airport security and trains European and American border guards to detect suspicious behavior. The TSA has over 3000 behavior detection agents. These are people with their own political and religious beliefs, prejudices, and variance -- and they can't be audited rigorously. We have just never heard their accuracy, so we can't say if lie detectors can beat this (or can help as a human tool). But I bet they can.

I was disappointed that the actual video chat with the journalist and the digital border guard was not included in the investigative article. They argue that the system should be interpretable, but give no full transparency themselves. I'd trust that she did not tell any lies, but I don't trust that they did not try to game/fool the system, so as to have an actual article to write. Anyway, using just one test subject is majorly flawed, and comes close to not understanding that science can't provide 100% accurate predictions, just probabilities. I feel it is a reasoning flaw to discard any automated system by homing in on a single mistake.

3 comments

Why would you wait until someone is no longer a pedophile before deeming them ready for rehabilitation?

Sure, one could grant that gay people on average are slightly more feminine, and criminals on average have higher testosterone (as do athletes and law enforcement officers).

How do you define "work"? How is this information usable, even slightly, in a security context, and how would it overcome the completely predictable shitshow that it will create in practice?

You say "telltale", but that's not supported by the evidence.

> Israel has the most advanced airport security

Because they have highly trained officers interrogating people and searching packages, not running AI dowsing rods.

Rehabilitation in society: most people do not want convicted pedosexuals who show no signs of betterment to be around children, just like most people do not want murderers released when they say to the prison doctor that they still have an urge to kill.

Work, as in serve as a double-check for a human border agent. If someone failed to correctly (as deemed by a reasonably accurate system) answer all 16 questions, I do not want to fly with that person, before a border guard has had a second look. This is how fraud detection often works: An automated system gives a high score, and possible explanations for this score, and then a human analyst can make a more informed decision.

Here are some telltale signs that someone is lying: https://parade.com/57236/viannguyen/former-cia-officers-shar... & https://www.businessinsider.com/how-to-tell-someones-lying-b...

Model-performance based accuracy (both human and artificial neural networks) supports the evidence for efficiency.

> Because they have highly trained officers interrogating people and searching packages, not running AI dowsing rods.

These highly trained officers also sit behind a video camera to observe passengers. Do you think detecting suspicious behavior from video is AGI-complete? BTW: Israel invests a lot into large scale face detection at its borders, has plenty of intelligent hardware devices aiding its security, uses statistics to skip a pat-down of a 5-year-old Israeli boy, they track cars the moment they enter the parking lot and track the time there -- and cross-reference if this car has been near the border or power plants, they may (not sure) do social media analysis, like the US is doing now, the Israeli army unit Intelligence Corps 8200 is actively supporting airport security, the Israeli border patrol focuses all their attention on passengers, and not their luggage (why search their luggage after they've been cleared by a behavior check?), they use TraceGuard to swab clothes for substances, they have a similar Suspect Detection System called VR-1000 which automatically checks for signs of lies, such as profuse body sweat and eye movements, BellSecure ties up all sources of information on the web and in databases to get a better no-fly list, they track their own border agents with automated systems to spot opportunities for learning and misbehavior, WeCU also automatically checks facial clues, they have automated weapon scan systems, Vigilant's surveillance systems are deployed in Israel and the US and act as a digital border guard and motion/gait recognizer.

What may sound like an AI dowsing rod to you, could actually help combat airline terrorism.

> WeCU Technologies (as in "we see you") is a technology company based in Israel that is developing a "mind reading" technology for the purpose of detecting terrorists at airports. The company's products evaluate reactions to specific images for indications that someone is a potential threat.

> The technology involves projecting an image that only a terrorist would be likely to recognize onto a screen. The idea is that people always react when they see a familiar image in an unexpected location. For example, if a person unexpectedly saw an image of their own mother on the screen, their face and body would react. For the terrorist detection, the people passing by the screen would be monitored partly by humans, but mostly by hidden cameras or sensors that are capable of detecting slight increases in body temperature and heart rate. Other detection devices, which are more sensitive and currently under development, could be added later.

>> Here are some telltale signs that someone is lying: https://parade.com/57236/viannguyen/former-cia-officers-shar.... & https://www.businessinsider.com/how-to-tell-someones-lying-b....

That is all bunkum as evidenced by the sources quoted (business insider?).

Just to hand-pick an example I find particularly egregious - that touching one's face is a sign of lying. This guy would disagree:

https://www.youtube.com/watch?v=HlmNqwEhGIk

(Zizek ticking)

No, the sources are from CIA and FBI agents trained in interrogation and spotting lies (and wanting to sell their books, like researchers want their research read). One of the agents used these signs to know that Timothy McVeigh was lying. They also give a counter to your hand-pick: Observe the person when they are not lying, in a natural environment, note any tics, and discount these when interrogating.

Place your lead hand thumb on your cheek and two fingers on your chin and imagine you are talking to someone standing one meter from you. Do you feel sincere?

There is plenty of research that shows that lie detection is not all bunkum, and that techniques such as cognitive overloading help catch lies and lower defenses (which need focus and don't come naturally to most people).

>> Place your lead hand thumb on your cheek and two fingers on your chin and imagine you are talking to someone standing one meter from you. Do you feel sincere?

I really can't think of anything I could do that could make me feel insincere when I was being sincere. This sounds a bit like the discredited claims about power-posing, or smiling to feel better etc.

I'm sorry but I really think you're letting yourself be taken in by some extraordinarily shoddy science and by the pseudo-scientific claims of people who are either engaging in magickal thinking and really believe they can "tell when you're lying" or just charlatans trying to take advantage of the naivete of others.

> I'm sorry but I really think you're letting yourself be taken in by some extraordinarily shoddy science and by the pseudo-scientific claims of people who are either engaging in magickal thinking and really believe they can "tell when you're lying" or just charlatans trying to take advantage of the naivete of others.

Did you win the Putnam?

Please do provide sources for all these claims.

You're citing the Stanford gaydar paper, a pseudo-scientific attempt to cash in on the hype about neural nets. It was widely condemned for its ethical and technical deficiencies at the time.

e.g.:

https://thenextweb.com/artificial-intelligence/2018/02/20/op...

Edit: to clarify, I'm also interested in why you think all you say in your comment is true. The sources you cite either do not support your claims, or are disreputable like the deep gaydar paper [edit: or they are irrelevant like the sources about the training of border agents].

For example, I quote from the Wikipedia article on the plethysmograph:

>> 1998 large-scale meta-analytic review of the scientific reports demonstrated that phallometric response to stimuli depicting children, though only 32% accurate, had the highest accuracy among methods of identifying which sexual offenders will go on to commit new sexual crimes.

32% accuracy means those tests are incapable of detecting whatever they're looking for. Even if other tests are worse. My dowsing rod is better than my crystal ball at finding water, but that doesn't make it accurate.

> The sources you cite either do not support your claims, or are disreputable like the deep gaydar paper

"Measuring sexual arousal: https://en.wikipedia.org/wiki/Penile_plethysmograph & https://en.wikipedia.org/wiki/A_Place_for_Paedophiles" certainly seems to support the first claim: "The fruit machine was reincarnated for pedosexuals: a device attached to their genitals measures if they get sexual arousal from pictures of children. Those that do are not deemed ready for rehabilitation."

But the parent maintains that "these systems do work" when the Wikipedia page says the opposite is true.

No. This is what the Wikipedia page says for measuring sexual response in pedosexuals:

> In one study, 21% of the subjects were excluded for various reasons, including "the subject's erotic age-preference was uncertain and his phallometrically diagnosed sex-preference was the same as his verbal claim" and attempts to influence the outcome of the test.[28] This study found the sensitivity for identifying pedohebephilia in sexual offenders against children admitting to this interest to be 100%. In addition, the sensitivity for this phallometric test in partially admitting sexual offenders against children was found to be 77% and for denying sexual offenders against children to be 58%. The specificity of this volumetric phallometric test for pedohebephilia was estimated to be 95%.

> Further studies by Freund have estimated the sensitivity of a volumetric test for pedohebephilia to be 35% for sexual offenders against children with a single female victim, 70% for those with two or more female victims, 77% for those offenders with one male victim, and 84% for those with two or more male victims.[30] In this study, the specificity of the test was estimated to be 81% in community males and 97% in sexual offenders against adults. In a similar study, the sensitivity of a volumetric test for pedophilia to be 62% for sexual offenders against children with a single female victim, 90% for those with two or more female victims, 76% for those offenders with one male victim, and 95% for those with two or more male victims.[31]

> In a separate study, sensitivity of the method to distinguish between pedohebephilic men from non-pedohebephilic men was estimated between 29% and 61% depending on subgroup.[27] Specifically, sensitivity was estimated to be 61% for sexual offenders against children with 3 or more victims and 34% in incest offenders. The specificity of the test using a sample of sexual offenders against adults was 96% and the area under the curve for the test was estimated to be .86. Further research by this group found the specificity of this test to be 83% in a sample of non-offenders.[32] More recent research has found volumetric phallometry to have a sensitivity of 72% for pedophilia, 70% for hebephilia, and 75% for pedohebephilia and a specificity of 95%, 91%, and 91% for these paraphilias, respectively.

These systems work! And, while scary, or invasive, or not 100% accurate, this is no argument to reason that they don't.
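To put the quoted sensitivity and specificity figures in context, here is a minimal sketch of how they translate into the share of positive test results that are correct (positive predictive value). The 72% sensitivity and 95% specificity are the "more recent research" figures quoted above; the prevalence values are hypothetical assumptions for illustration only, since the thread gives no base rate for any tested population.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of positive test results that are true positives (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Figures quoted above for volumetric phallometry (pedophilia).
SENSITIVITY = 0.72
SPECIFICITY = 0.95

# Hypothetical prevalences in the tested group: assumptions, not data.
HYPOTHETICAL_PREVALENCES = (0.50, 0.10, 0.01)

if __name__ == "__main__":
    for prevalence in HYPOTHETICAL_PREVALENCES:
        ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
        print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}")
```

Under these assumptions the same test is far more informative in a group where the condition is common than in one where it is rare, which is why "better than random" and "accurate enough to act on" are separate questions.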

There has been no peer-reviewed paper calling the gaydar paper into question. There was a master's student who tried to replicate the study with his own crawled dataset, and got better than human guessing, but slightly below the paper's accuracy. News outlets ran with that to say that the study was flawed. Another attempt was by a Googler who claimed that the neural net solely looked at eye shadow or glasses, but he also got better than random and human guessing on his own sanitized dataset, and one could argue that eye shadow and glasses are fair game when classifying from a face picture, as they are included in the picture, and these pictures were also shown to the human evaluators (even ground).

The Next Web article is by a journalist with a history degree, not an ML scientist. But based solely on the merit of his arguments, he also agrees with the results of the paper:

> there’s nothing wrong with the paper and all the science (that can actually be reviewed) obviously checks out.

and seems to take more issue with the ethical considerations, binary sexuality, and builds his point around: humans have no functioning gaydar at all, so it is insignificant that a neural net could beat a coin flip. His point is weak, as he gives no evidence for humans lacking a gaydar, and the paper (which was not wrong as claimed) includes human assessments which are higher than random guessing.

I think my contrarian view is true from mere pragmatism: Israel has the best airport security in the world, and uses these Suspect Detection Systems extensively, seemingly constantly improving and making enough profit for new players to enter the market. AKA the people that actually do this for a living keep innovating on it, and I find that rather unlikely if all of this is tea leaf reading.

I think, in general, that the HN crowd overreacts when it comes to controversial tech, and that a simplistic "this does not work, and is a sham, and fraud to take research money" is an uninformed weak claim. It takes a lot of chutzpah to denounce the many months work of legit scientists as obviously flawed from behind your keyboard when one probably has not even read the full paper. The authors, by picking such a controversial topic, are partly to blame for this pushback and popular media reporting, but that does not make it right.

I will not defend the use of plethysmograph and eye tracking studies to measure a sexual response. I just claim that it is better than random guessing, that it allows for better treatment when measurements are out of line with self-reports, and that it is still in use and very similar to the Fruit Machine. The Fruit Machine is already back.

> My dowsing rod is better than my crystal ball at finding water,

I do not get what you are referring to here (I know you as an ML-knowledgeable person from your other comments, so I am afraid to assume things, but if your crystal ball is random, and your dowsing rod is better than random, you are successfully doing predictive modeling, no, not a sham? [1]). These systems do not need extremely high accuracy if they do not auto-deny a person, and it is moving the goalposts a bit to demand accuracy when better than random guessing has been demonstrated (which is questioned by the majority of the commenters here).

> or they are irrelevant like the sources about the training of border agents

User kindly requested sources for all of my claims. I claimed this and sourced it. My point was that we already have human Suspect Detection Systems in place, so either those must go (you have a fundamental problem with SDS's) or they can't be automated (because you don't trust AI research or believe these systems need common sense problem solved first). I could then offer counter-arguments to both.

For the question about the eye direction, look at the sourcing for telltale signs of lies I posted in reply to another commenter. It depends on whether you are left- or right-handed.

[1] > A concept class is learnable (or strongly learnable) if, given access to a source of examples of the unknown concept, the learner with high probability is able to output an hypothesis that is correct on all but an arbitrarily small fraction of the instances. The concept class is weakly learnable if the learner can produce an hypothesis that performs only slightly better than random guessing. In this paper, it is shown that these two notions of learnability are equivalent. - The Strength of Weak Learnability
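A minimal sketch of the idea in [1], that learners only slightly better than chance can be combined into an accurate one. Note the assumptions: this uses AdaBoost with decision stumps, a later boosting algorithm, not the construction from the quoted paper, and the ten-point data set is a hypothetical toy example.

```python
import math

# Hypothetical 1-D toy data: no single threshold separates the labels.
X = list(range(10))
Y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]


def stump_predict(threshold, polarity, x):
    """Weak learner: predict `polarity` below the threshold, `-polarity` otherwise."""
    return polarity if x < threshold else -polarity


def best_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the lowest-error stump."""
    thresholds = [min(xs) - 0.5] + [x + 0.5 for x in sorted(xs)]
    best = None
    for threshold in thresholds:
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(threshold, polarity, x) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Train `rounds` stumps; return a list of (alpha, threshold, polarity)."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = best_stump(xs, ys, weights)
        err = min(max(err, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, threshold, polarity))
        # Up-weight the examples this stump got wrong, then renormalise.
        weights = [w * math.exp(-alpha * y * stump_predict(threshold, polarity, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def ensemble_predict(ensemble, x):
    """Weighted vote of all stumps."""
    score = sum(alpha * stump_predict(t, p, x) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1


def accuracy(predict, xs, ys):
    return sum(predict(x) == y for x, y in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    _, t, p = best_stump(X, Y, [1.0 / len(X)] * len(X))
    print("best single stump:", accuracy(lambda x: stump_predict(t, p, x), X, Y))
    model = adaboost(X, Y, rounds=3)
    print("3 boosted stumps: ", accuracy(lambda x: ensemble_predict(model, x), X, Y))
```

On this toy data the best single stump gets 7 of 10 training points right, while three boosted stumps classify all ten correctly. This is training accuracy on a contrived example, so it illustrates the mechanism only and says nothing about any real detection system.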

Regarding the gaydar paper, yes, I have read the full paper (if memory serves, I read two versions, a pre-print and the published paper). At the time, I wanted to publish a rebuttal, perhaps a letter in a journal or something, but in the end I didn't think I'd be adding much to the debate and the paper had been widely discredited already anyway.

My objection to the methodology in the paper was that the authors had assembled a dataset where the distribution of gay men and women was 50% of the population, i.e. there were as many gay women as straight and as many gay men as straight in the data. This was for one of their datasets, the one where everyone had a picture. There were two more where the distribution was less even but still nothing like what it's usually estimated to be. This despite the fact that the paper itself cited a result that gay men and women are around 7% of the population.

The reason for this discrepancy was clearly to improve the results by reducing the number of false negatives which are expected when there are many more negative than positive examples in binary classification.

This from the point of view of machine learning. There were other flaws that others pointed out, e.g. the choice of metric (I don't remember what it was now, I can look it up if you like), the premising of the paper on prenatal hormone theory that is another piece of bunkum without any evidence to back it etc.

And of course there were the ethical considerations.

Sorry but I don't have the courage to reply to the rest of your comment. You write way too much.

Rebalancing an imbalanced dataset is common in industry and academia. You do that when you focus on accuracy, so that a claim like "we were 54% accurate at classifying the sexuality of women" is easily interpretable without needing a distribution-matched benchmark (you simply know the baseline is a coin flip).

If there is signal in the rebalanced dataset, there should be signal in the imbalanced dataset. If they'd switched to logloss or AUC and an imbalanced dataset, do you think their results would now be as good as random? Because that is what you are implying, and you are basically implying the research is fraudulent. This is a very strong claim to make, in the absence of legit discrediting studies that failed to replicate any predictability, and requires more than guessing that the authors' rebalancing act was "clearly" to improve the accuracy (with a 7% minority class, you could get 93% accuracy by always predicting the majority class, so if they wanted to inflate the accuracy, they shouldn't have rebalanced).
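To make the baseline arithmetic in this exchange concrete, here is a minimal sketch. The 7% figure comes from the thread; the sensitivity and specificity values are hypothetical, chosen only for illustration, and are not results from the paper.

```python
def majority_baseline_accuracy(minority_share):
    """Accuracy of a 'classifier' that always predicts the larger class."""
    return max(minority_share, 1.0 - minority_share)


def accuracy(sensitivity, specificity, minority_share):
    """Overall accuracy when the minority class is treated as the positive class."""
    return sensitivity * minority_share + specificity * (1.0 - minority_share)


# Hypothetical classifier with real but imperfect signal (assumed values).
SENSITIVITY = 0.60
SPECIFICITY = 0.90

if __name__ == "__main__":
    for share in (0.50, 0.07):  # balanced set vs. the 7% figure from the thread
        print(f"minority share={share:.2f}  "
              f"baseline={majority_baseline_accuracy(share):.3f}  "
              f"classifier={accuracy(SENSITIVITY, SPECIFICITY, share):.3f}")
```

With these made-up numbers the classifier beats the coin-flip baseline on balanced data (0.75 vs 0.50) but scores below the trivial baseline at a 7% minority share (about 0.88 vs 0.93), even though its sensitivity and specificity are unchanged. That is one reason accuracy is easier to read on a balanced set, and why metrics such as AUC are usually preferred on imbalanced data.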

The ethical considerations are moot/personal opinion, as they passed the ethics board of Stanford. Those are people who evaluate ethics of academic research for a living, or are you saying they were also shoddy and wrong to give this a pass?

Magical thinking is not wanting something to be true, because it would be an uncomfortable truth, and so deeming that something which is objectively true, must be false, so you can continue to think happy thoughts in line with your world view.

You keep talking about the paper being widely discredited, but can't provide a single academic source for this. Instead, you question my sources (business insider?) while posting articles from The Next Web written by a History degree journalist who does not want the concept of binary sexuality to be true, or even allow it in constructing a dataset of gay and straight people by self-classification.

It takes more energy and letters to attack a point than to make a point. You made quite a lot of weak points.

> On deep nets having better gaydars than average human: https://psycnet.apa.org/doiLanding?doi=10.1037%2Fpspa0000098

Didn't that recognition system boil down to being an eyeglass and eyeshadow detector?

No. (And I feel there is no justification for downvoting requested sources.)

>> looking not in the direction commonly associated with recall

Which direction is that?