Hacker News
by bluedino 1635 days ago
Gmail's spam filtering isn't exactly a high bar
5 comments

Based on my experience running mail servers in the past (both personal and corporate), I'd say you're wrong.

Based on my experience running mail servers for a long time and today, I'd say OP is right.

Gmail's spam filtering is terrible. On my various Gmail accounts, I get spam in the inbox and good email goes to the spam folder. And there's nothing you can do: you can mark it not-spam a thousand times and it's still a crapshoot.

I don't have any of those problems with my self-hosted email.

I second this.

Gmail spam filtering is top notch. I just stopped caring about obfuscating or hiding my email address (which I have used since Gmail's beta invitation program) and I can count the spam I actually read in a year with one hand.

Gmail's spam filtering has a high false positive rate.

It classifies Stripe's and PayPal's important security emails as spam; I posted previously on HN:

https://news.ycombinator.com/item?id=19536465

It's easy to bring down the number of false negatives if you allow the number of false positives to be arbitrarily large.
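
To make that trade-off concrete, here is a minimal sketch. The scores and labels are made up for illustration and have nothing to do with how Gmail actually scores mail. It assumes a filter that assigns each message a spam score and sends anything at or above a threshold to the spam folder. Lowering the threshold drives false negatives (spam in the inbox) toward zero while false positives (ham in the spam folder) climb.

```python
# Hypothetical (score, true_label) pairs; 0.0 = surely ham, 1.0 = surely spam.
# These numbers are invented for illustration only.
MESSAGES = [
    (0.05, "ham"), (0.20, "ham"), (0.40, "ham"), (0.55, "ham"), (0.70, "ham"),
    (0.35, "spam"), (0.60, "spam"), (0.80, "spam"), (0.90, "spam"), (0.95, "spam"),
]


def confusion(threshold):
    """Return (false_positives, false_negatives) for a given threshold.

    A message is classified as spam when its score >= threshold.
    False positive: ham sent to the spam folder.
    False negative: spam delivered to the inbox.
    """
    fp = sum(1 for score, label in MESSAGES if label == "ham" and score >= threshold)
    fn = sum(1 for score, label in MESSAGES if label == "spam" and score < threshold)
    return fp, fn


if __name__ == "__main__":
    # Sweep from a strict threshold to one that flags everything.
    for threshold in (0.9, 0.65, 0.5, 0.3, 0.0):
        fp, fn = confusion(threshold)
        print(f"threshold={threshold:.2f}  false positives={fp}  false negatives={fn}")
```

At a threshold of 0.0 every message is flagged, so no spam reaches the inbox, but every legitimate message lands in the spam folder too.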

On my GSuite business email, I've had > 50 incoming business-relevant emails this year that were incorrectly classified as spam. My personal self-hosted email server [1] lets through a bit more spam than Gmail, but it also doesn't suffer this big false-positive rate.

[1]: https://nh2.me/recent/Running-your-own-mailserver.pdf

"I can count the spam I actually read in a year with one hand."

This is partly because Gmail is good at classifying emails as spam/ham.

But it's partly because it's more tolerant of false positives (ham sent to the spam folder) than you or I would be if we were tweaking our own spam filter.

I occasionally check my spam folder, and there are usually some mailing list emails that I don't care about, but which I did actually subscribe to, and would have wanted to reach my inbox.

> Gmail is good at classifying emails as spam/ham.

I wish they'd apply that discrimination to their SMTP output.

Seriously? I actually think Gmail's spam filtering is brilliant - I probably average less than a single spam email a year that it doesn't catch.

Contrast that with every corporate email spam filter I've ever been subject to, which vary from "shit" to "OK", and Gmail is completely in another league.

My problem with Gmail is the false positives. (Or is it negatives?) They routinely send too much to the spam box and others tell me they have the same experience.

The worst is when they take email from one Google-hosted domain and send it to spam in another Google-hosted domain, even though the email didn't leave their network at all.

Still, I agree that the overall level is pretty good and hard to duplicate.

> even though the email didn't leave their network at all.

FYI, Gmail treats all of its children equally. Mail from one Google user to another is subject to the exact same treatment as mail received via SMTP (and, indeed, Gmail sends traffic to itself over SMTP). If you study the headers of messages in Gmail, you can form a picture of how they allocate and use the virtual IPs.
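
For anyone who wants to study headers that way, a rough sketch using Python's standard `email` module follows. The raw message is a made-up example (placeholder hostnames, a documentation-range IP, invented IDs), not real Gmail output, and the helper names `received_hops` and `relay_ips` are hypothetical. It lists the `Received` hops in the order the message travelled and pulls out any bracketed IPv4 addresses.

```python
import re
from email import message_from_string

# Invented sample message; hostnames, IDs, and the IP are placeholders.
RAW = """\
Received: by mx.example.net with SMTP id abc123; Mon, 1 Jan 2024 10:00:02 +0000
Received: from mail-out.example.com (mail-out.example.com [192.0.2.10])
\tby mx.example.net with ESMTPS id def456; Mon, 1 Jan 2024 10:00:01 +0000
From: alice@example.com
To: bob@example.net
Subject: hello

body
"""


def received_hops(raw):
    """Return the Received headers in chronological order, whitespace-normalized."""
    msg = message_from_string(raw)
    hops = msg.get_all("Received") or []
    # Each relay prepends its own header, so the oldest hop is last in the file.
    return [" ".join(hop.split()) for hop in reversed(hops)]


def relay_ips(raw):
    """Return bracketed IPv4 addresses found in the Received headers, oldest first."""
    pattern = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")
    return [ip for hop in received_hops(raw) for ip in pattern.findall(hop)]


if __name__ == "__main__":
    for number, hop in enumerate(received_hops(RAW), start=1):
        print(number, hop)
    print(relay_ips(RAW))
```

In Gmail, "Show original" gives you the raw message to feed into something like this.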

I get them every day.

https://imgur.com/a/wXCocLd

Have not had to deal with spam on my personal Gmail address in the 10 years I've been using it, and I'm having the same experience running a big Workspace organization. Their spam/phishing detection is making my job a lot easier.

I also have serious doubts about Google's spam fighting. While they catch a lot of spam in the spam folder, they are simultaneously overzealous, catching normal emails that I receive and read regularly, and underprepared, as if putting myusername@aol.com on an email and sending it to Gmail's servers weren't totally obvious spam.

Gmail's spam filtering is still the best I've seen from the major e-mail providers, so I disagree with your assessment.