by BLKNSLVR 901 days ago
What kind of tiers are there for filtering?

E.g. known-bad domains, known-bad IP addresses, incorrectly set up DKIM / SPF, no reverse DNS, non-matching reverse DNS, and that's before even looking at the content to determine whether it's spam.

1 comments

For privacy and compliance reasons (read: “oh boy wouldn’t wanna get sued, eh?” reasons) we actually don’t snoop into the message body much. Hooray, good job on not doing the maximally big brother thing for once, MS!

My hot take is that this prolly won’t last, because every org eventually descends to a creepy level of data collection, so I have a textbook on privacy-preserving ML downloaded for when we join the “surveillance but we found a way to make it technically legal” squad. We haven’t done that yet though.

What do you mean by tiers, exactly?

I was trying to ask generically because Microsoft deals with a universe-sized quantity of email traffic compared to my self-hosted, barely used domains.

By tiers (which may be the wrong word, maybe just 'layers'), only relating to my setup, I mean things like:

- Tier 1: Spamhaus DROP and eDROP lists are outright blocked

- Tier 2: IP addresses that have illegitimately connected to my mail server ports are outright blocked (port scans, invalid login attempts, etc. - I manually check some of these against abuseipdb.com to determine their validity)

- Tier 3: IP addresses that have scanned non-open ports on my systems are outright blocked from connecting to my mail server ports

Just running these rules for a couple of months has cut unwanted connections to my mail server ports by a large percentage. My theory is that if you can block known-bad and highly-likely-bad connections up front, then the amount of actual spam detection (through email content review) you have to do is reduced to a certain degree.

I actually want to implement additional anti-spam IP address block lists and just haven't gotten around to it yet, but the above does a good enough job for my essentially unknown domains (as I said, a universe of difference from what Microsoft has to deal with).

- Tier 4: Black-box spam detection built into the all-in-one mail server solution I use (I don't know how it works, and I don't know how to edit the 'rules' or even if I can).
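
To make Tiers 1 to 3 concrete, here is a minimal sketch of the decision they amount to, not my actual firewall config. The networks and addresses are made-up placeholders from the documentation ranges, and the names (DROP_NETWORKS, MAIL_PORT_ABUSERS, PORT_SCANNERS, blocking_tier) are hypothetical. In practice the first list would be loaded from the Spamhaus DROP feed and the other two from firewall / mail server logs.

```python
import ipaddress

# Hypothetical placeholder data (documentation ranges), standing in for:
# Tier 1: networks from the Spamhaus DROP / eDROP lists
DROP_NETWORKS = [ipaddress.ip_network(n) for n in ("192.0.2.0/24", "198.51.100.0/24")]
# Tier 2: addresses seen abusing the mail server ports (bad logins etc.)
MAIL_PORT_ABUSERS = {ipaddress.ip_address("203.0.113.7")}
# Tier 3: addresses seen scanning non-open ports elsewhere on the network
PORT_SCANNERS = {ipaddress.ip_address("203.0.113.99")}


def blocking_tier(ip_text):
    """Return the first tier (1, 2 or 3) that blocks this address, or None."""
    ip = ipaddress.ip_address(ip_text)
    if any(ip in net for net in DROP_NETWORKS):
        return 1
    if ip in MAIL_PORT_ABUSERS:
        return 2
    if ip in PORT_SCANNERS:
        return 3
    return None  # passes Tiers 1 to 3; Tier 4 (content filtering) still applies


if __name__ == "__main__":
    for candidate in ("192.0.2.55", "203.0.113.7", "203.0.113.10"):
        print(candidate, "->", blocking_tier(candidate))
```

The cheap checks run first, so only connections that pass all three ever reach the black-box content filter.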

'Tiers' I would expect Microsoft to have would be:

- Their own lists of known-bad IP addresses / ranges / ASNs

- Reverse DNS lookup validation

- DKIM checks

- SPF checks

- More protocol level 'things' beyond the understanding of a simple network admin such as myself.

- Weighting the results of all of the above to determine some kind of 'spam likelihood' score.

All of this is before reviewing the content of the actual message.
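
Purely as an illustration of the "weighting" idea, and not a claim about how Microsoft (or anyone) actually does it, a sketch could look like the following. The check names, the point values, and the spam_likelihood function are all assumptions invented for this example. The results of the individual checks are passed in as booleans, so nothing here performs real DNS, DKIM or SPF lookups.

```python
from dataclasses import dataclass


@dataclass
class ConnectionChecks:
    """Outcome of the pre-content checks for one incoming message."""
    on_blocklist: bool  # sender IP / range / ASN is on a known-bad list
    rdns_valid: bool    # reverse DNS exists and matches the forward lookup
    dkim_pass: bool     # DKIM signature verified
    spf_pass: bool      # SPF check passed


# Hypothetical weights in whole points (0 to 100 total). The values are
# invented; a real system would tune them against observed traffic.
POINTS = {
    "on_blocklist": 50,
    "rdns_invalid": 20,
    "dkim_fail": 15,
    "spf_fail": 15,
}


def spam_likelihood(checks):
    """Sum the points of every failed check; 0 is clean, 100 is worst."""
    score = 0
    if checks.on_blocklist:
        score += POINTS["on_blocklist"]
    if not checks.rdns_valid:
        score += POINTS["rdns_invalid"]
    if not checks.dkim_pass:
        score += POINTS["dkim_fail"]
    if not checks.spf_pass:
        score += POINTS["spf_fail"]
    return score


if __name__ == "__main__":
    msg = ConnectionChecks(on_blocklist=False, rdns_valid=False,
                           dkim_pass=True, spf_pass=False)
    print(spam_likelihood(msg))
```

A score like this could then be compared against thresholds (reject, quarantine, deliver), all before the message body is examined.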