|
Research in 2017 demonstrated a high level of accuracy in determining from a single image of a face whether the person was gay: 81% for men, 71% for women. When shown 5 pictures per person, accuracy rose to 91% and 83%, a jump of 10 points or more in either case. This was with a relatively small neural network fine-tuned on a relatively tiny dataset of about 35k images of faces from a dating site.

If I had a million dollars I'd gladly wager it that some company with a deep dataset, like Google, could create a 99% or better profiler that goes just off a video of someone's face (not a single still image, but I'd bet a single-image profiler could beat 90%). Transformers allow for a nearly arbitrary vector length for feature space - if sexuality correlates at all with any of a million different facial features, then neural networks will be able to detect it. If you're doing a binary "straight or not" test, without distinguishing between all the values of "not-straight", then you could use a very shallow, very wide transformer architecture with a million features, train it on a consumer card, and get accuracy in the 90% range.

That initial study had technical flaws, not least of which were the binary classification of gay and straight and the use of only white people. Technically, they used a base model, VGG-Face, which had a 4096-feature model and 17 convolutional layers. That's less powerful than something like nano-gpt, and GPT-2 is orders of magnitude more complex and has a much higher degree of capability. Human judges, by comparison, scored 61% for men and 54% for women, not far from a coin toss. If you did this with nuance and skill and high technical savvy, with a sophisticated model of sexual preferences (not the 1950s notion of straight or not straight), you could get a very accurate and deeply creepy piece of software.

This works for emotions, nonverbal communication, truthfulness, etc. Biometrics can provide a terrifyingly deep analysis of things you consider private and hidden but which nonetheless show up as unintended evidence available for analysis. If you had a few hundred of these types of analyzers - say, for psychological factors, fitness, health issues, sexuality, political preference, and so on - then you could not only get a highly accurate snapshot of people through deanonymized bulk surveillance data freely available on the market, you could also create LLMs tuned specifically to the features and preferences of each individual, and then use A/B testing on your virtual populations to maximize engagement, force specific reactions and behaviors in response to media (timing, pacing, content, framing), and so on.

We're not nearly as inscrutable, private, or resilient as many people think, and there's all sorts of data being misused already. Maybe we should get that universal digital bill of rights thing going before BlackRock or Honeywell or the DNC decide to go all in on AI.

edit: To clarify, I'm not cheering this stuff on. No university would allow the study, and most companies would open themselves up to significant legal scrutiny if such a thing were ever used and they got caught, but this is a weekend project for a quant at a big firm - it'll cost them 20 hours and a case of Red Bull. With all the AI infrastructure out there, the time, knowledge, effort, and cost to achieve things like this are dropping fast. |
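The jump in accuracy from one picture to five can be illustrated with a small, idealized calculation. If each image yielded an independent guess that was right with probability p, a majority vote over n images would be right more often than any single guess. The sketch below computes only that textbook binomial figure. It assumes independent errors, which photos of the same face will not satisfy, so it should be read as an upper-end idealization and not as the way the 2017 study necessarily combined images. The function name `majority_vote_accuracy` is made up for illustration.

```python
from math import comb


def majority_vote_accuracy(p: float, n: int) -> float:
    """Chance that a majority of n independent guesses is correct.

    Assumption (not from the study): each guess is right with the same
    probability p, and errors are independent across guesses.
    n must be odd so a strict majority always exists.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd so there is always a majority")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    needed = n // 2 + 1  # smallest number of correct guesses that wins the vote
    # Sum the binomial probabilities of getting 'needed' or more guesses right.
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(needed, n + 1)
    )


if __name__ == "__main__":
    # Single-guess accuracies used here are illustrative inputs.
    for p in (0.71, 0.81):
        print(f"p={p}: n=1 -> {majority_vote_accuracy(p, 1):.3f}, "
              f"n=5 -> {majority_vote_accuracy(p, 5):.3f}")
```

For p = 0.81 and n = 5 the function returns roughly 0.95. Correlated inputs, such as several photos or video frames of one person, add less information than independent ones, so real gains would be expected to fall short of this figure.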
key conditional embedded deeply in that comment.