by thrownaway2424 4256 days ago
Maybe. I feel like the spam number is misleading: OK, maybe 70% of emails are spam, but spam messages are smaller than real ones, so spam isn't 70% of the size of the stored corpus. Also, nobody stores spam; they delete it.
2 comments

Two organizations where I've worked (1000s of people) managed their own email infrastructure, with their own spam filters. Spam ranged from 70% to 75% of incoming emails.

Most spam emails are just a few words of plaintext, so I'm also surprised that spam would make up the same percentage of the stored corpus by size. It seems like a few emails with medium-to-large attachments would outweigh the spam content, which should be on the order of kilobytes per email at most.
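
To put rough numbers on the count-versus-size point, here is a back-of-the-envelope sketch. The 70% figure comes from the thread; the average message sizes (5 KB for spam, 75 KB for legitimate mail) and the function name `spam_byte_share` are made up purely for illustration, not measurements from anywhere.

```python
def spam_byte_share(spam_count_share, avg_spam_kb, avg_ham_kb):
    """Return spam's share of total bytes, given its share of message count."""
    spam_bytes = spam_count_share * avg_spam_kb
    ham_bytes = (1.0 - spam_count_share) * avg_ham_kb
    return spam_bytes / (spam_bytes + ham_bytes)


# 70% of messages are spam (from the thread).
# Average sizes are hypothetical: 5 KB per spam message, 75 KB per real one.
share = spam_byte_share(0.70, 5.0, 75.0)
print(f"Spam share by size: {share:.1%}")
```

Under those assumed sizes, spam would be a much smaller share of the bytes than of the message count. Different size assumptions shift the result, so treat it as an illustration of the argument rather than an estimate.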

These are both excellent points. I've done a bit more digging, and updated the article to reflect them. Thanks!