Hacker News
Ask HN: Is there a list of definite spam words/phrases?
2 points by throwaway000021 3261 days ago
It appears to me that spam filters work out the probability that a given email is spam using some sort of algorithm.

However, I have recently been processing hundreds of thousands of emails, including hundreds of thousands of spam messages.

I can see in the spam emails that there are many words and phrases used in the subject lines that will simply never appear in any valid email that I care about receiving. For example:

H00kup

F*ckbuddy

Xenical

Viagra

BangBuddy

In addition to the single words, there are various phrases in the subject lines that would never appear in a legit email, such as:

Affordable lux Copy watches

accessories for cheap

Designer timepieces for all tastes

Time to look rich

So it strikes me that maybe there is a list of words and phrases that, if found in an email subject line, definitely identify that email as spam.
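
To make the idea concrete, here is a minimal sketch of what such a check might look like in Python. The list is just the examples from this post, not an established list, and the names `SPAM_TERMS` and `is_definite_spam` are made up for illustration. It does a case-insensitive match of whole words/phrases against the subject line and makes no attempt to catch other obfuscated spellings.

```python
import re

# Hypothetical blocklist built only from the examples in this post.
SPAM_TERMS = [
    "h00kup",
    "f*ckbuddy",
    "xenical",
    "viagra",
    "bangbuddy",
    "affordable lux copy watches",
    "accessories for cheap",
    "designer timepieces for all tastes",
    "time to look rich",
]


def compile_terms(terms):
    """Build one case-insensitive pattern that matches any listed term."""
    # Escape each word so characters like '*' match literally, and allow
    # any run of whitespace between the words of a phrase.
    parts = [r"\s+".join(re.escape(word) for word in term.split()) for term in terms]
    # Use lookarounds rather than \b so the term is not matched inside a
    # longer word, even when the term contains non-word characters.
    return re.compile(r"(?<!\w)(?:" + "|".join(parts) + r")(?!\w)", re.IGNORECASE)


SPAM_RE = compile_terms(SPAM_TERMS)


def is_definite_spam(subject: str) -> bool:
    """Return True if the subject line contains any blocklisted term."""
    return SPAM_RE.search(subject) is not None
```

Something like `is_definite_spam(subject)` could then run before any probabilistic filter, so that a hit short-circuits the scoring. Whether a given term is truly "never legit" depends on whose mailbox it is, which is presumably why the list would need to be per-user.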

Does anyone know if such a list exists? A list of words/phrases that would only ever be found in a spam email?

1 comment

I came across 'stop words' (https://en.wikipedia.org/wiki/Stop_words) yesterday. Wikipedia says these word lists vary from service to service. You are unlikely to find a complete list, though lists of common words may be available.