by tgv 564 days ago
> According to forensic linguists, we all use language in a uniquely identifiable way that can be as incriminating as a fingerprint.

That's a bold and unproven statement, made worse because we can't really see that fingerprint.


It sounds like a fairly accurate statement to me, considering that there isn't a solid scientific foundation behind fingerprint matching either. Fingerprints aren't quite as unique as we've often been led to believe, and matching them is highly subjective, with the same expert often interpreting the same comparison differently when given a different story for context.

Fingerprint matching of course isn't completely useless, but it's not as solid as you'd hope either.

But when two sets of fingerprints are different, you can be fairly sure they're from different people. When the percentage of some feature is 20% in one text and 30% in another, you still can't conclude anything. I write in different registers in contexts such as personal emails, professional emails to a large group, professional emails to a direct colleague, a quick post on the internet, an 'app' to a friend in another country, a text message on a phone, etc. I even write them in different languages. It's hard to imagine there's a well-defined, properly grounded model that can unite those yet distinguish them from written output by other people.

And now LLMs are going to add more noise to these features...

There was this, which unmasked the identities of alt HN accounts 2 years back using stylometric analysis on a previous comment dump.

AFAIU, the more people know about it, the better their expectations will be about real account privacy.

https://news.ycombinator.com/item?id=33755016
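
For anyone wondering what "stylometric analysis" roughly means in practice, here is a minimal sketch, not the method used in the linked thread. It assumes a tiny, hypothetical feature set (ten English function words) and compares two texts by the cosine similarity of their relative word frequencies. The names `FUNCTION_WORDS`, `profile` and `cosine_similarity` are made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical feature set: a handful of common English function words.
# Real stylometry typically uses far more features than this.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "is", "it", "but"]


def profile(text):
    """Return the relative frequency of each function word among all word tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    email = "It is hard to say that the model is grounded, but it is a start."
    post = "The point of the post is that a fingerprint and a text differ in kind."
    print(cosine_similarity(profile(email), profile(post)))
```

A score like this only says how similar two frequency profiles are. As noted upthread, the same person writing in different registers can produce quite different profiles, so a single number from short texts is weak evidence either way.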