by dlkf
2170 days ago
I think this is a great first stab at the problem, but for two reasons I think a robust solution needs more work:
- The first is that, as someone else pointed out, Google is almost certainly logging your translation queries.
- Secondly, even if you do it offline (as someone else suggested) the approach itself might not work. Success in linguistic forensics isn't based (as we might naively assume) on catching obscure words that a particular individual has a tendency to overuse. It's based on subtle shifts in the relative frequency of function words. Depending on the proximity of the source and target language, round-trip machine translation might not change this.
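To make the point about function-word frequencies concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the word list is a tiny hand-picked sample, `profile`, `distance` and `rank_candidates` are hypothetical helper names, and plain Euclidean distance stands in for whatever measure a real forensic tool would use.

```python
import math
import re
from collections import Counter

# Tiny illustrative sample; an assumption, not a standard list.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in",
                  "that", "it", "is", "but", "on", "with"]


def profile(text):
    """Relative frequency of each function word among all tokens in text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]


def distance(p, q):
    """Euclidean distance between two frequency profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def rank_candidates(unknown_text, candidates):
    """Return candidate names ordered from closest to furthest profile.

    candidates maps an author name to a sample of that author's writing.
    """
    target = profile(unknown_text)
    return sorted(candidates,
                  key=lambda name: distance(target, profile(candidates[name])))
```

Running `profile` on a text before and after a round trip through a translator, and taking the `distance` between the two, gives a rough sense of whether the translation moved these frequencies at all. With samples this small the numbers mean little; it only shows the shape of the comparison.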
I got interested in forensic linguistics many years ago when an article in a somewhat shady publication mentioned me. I got curious and started reading anything I could find on the topic. I was eventually able to identify the author, but mostly by tricking him into admitting it after I had a ranked list of candidates. He was second on a list of about 4-5 people (out of a candidate set of perhaps 300). Not half bad for the rather crude methods I used. I was rather pleased with myself.
I've since used similar techniques to look at influence networks in companies.