Hacker News
by bergie 4789 days ago
As usual with Google, they're still not handling multilingual situations too well. It has been a long time since I've seen any spam in English, but dozens of spam mails in languages like Finnish and Georgian get through every day.

I wonder if the easiest solution would be a language blacklist. I have no legitimate reason to want to receive email in languages I don't understand.

2 comments

This is already available in some SMTP servers.

But it's not a cut and dried situation. It's hard to identify language in very short emails. It's quite possible to get false positives too, when you get mail from someone with an accented name. Language detection gives you a probability that it's in a particular language, not a yes/no.
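
To make the probability point concrete, here's a rough Python sketch of what a language blacklist has to deal with. It's a toy under loud assumptions: the stopword lists, the `language_scores` and `should_block` names, the 0.8 threshold and the 5-word minimum are all made up for illustration, and real detectors typically use character n-gram models rather than word lists. This is not how Gmail or any particular SMTP server does it.

```python
import re
from collections import Counter

# Hypothetical, tiny stopword lists; far too small for real use.
STOPWORDS = {
    "en": {"the", "and", "is", "you", "to", "of", "in", "for", "this", "that"},
    "fr": {"le", "la", "les", "et", "est", "vous", "de", "un", "une", "pour"},
    "fi": {"ja", "on", "ei", "se", "että", "oli", "mutta", "tai", "kun", "olen"},
}


def tokenize(text):
    """Lowercase the text and split it into runs of letters (Unicode-aware)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def language_scores(text):
    """Return {language: share of stopword hits}; empty dict if no evidence."""
    hits = Counter()
    for word in tokenize(text):
        for lang, stopwords in STOPWORDS.items():
            if word in stopwords:
                hits[lang] += 1
    total = sum(hits.values())
    if total == 0:
        return {}
    return {lang: count / total for lang, count in hits.items()}


def should_block(text, blocked_langs, threshold=0.8, min_words=5):
    """Block only when the message is long enough AND the score is high.

    Short messages are let through because there is too little text
    to estimate the language with any confidence.
    """
    if len(tokenize(text)) < min_words:
        return False
    scores = language_scores(text)
    return any(scores.get(lang, 0.0) >= threshold for lang in blocked_langs)
```

So `should_block("Ei", {"fi"})` lets the mail through because one word isn't enough to judge, while a longer Finnish message full of common words gets blocked. Where you put the threshold is the whole trade-off: set it low and a mostly English mail with a few foreign words or names starts getting caught, set it high and short spam slips past.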

I'm French and receive email in English and French; spam detection seems to work just as well in those two languages.