by defanor 1509 days ago
A major theme in self-hosted email discussions is deliverability issues (particularly to larger email service providers), and I tend to be unsure how bad it actually is: sometimes it does seem pretty bad, other times it sounds like it's fine, and possibly the chatter about failed deliveries is caused by misconfigured servers and/or misunderstandings.

Seems like it shouldn't be hard to check and collect reference statistics with a survey, though I'm failing to find surveys of that kind, and getting accounts on public services would be the tricky part for me personally (since I don't like to provide my phone number), so not doing that myself either. Only occasionally tried to check it with others, and messages were delivered fine in those cases -- but that's just a few samples.

5 comments

The problem with deliverability issues is the impossibility of proving a negative.

If I send an e-mail to a company's customer support, or to my senator, or I reply to a potential client, or I contact an open source mailing list and I don't receive a reply - do I know if my message made it to them or not?

I mean, it's plausible that JohnDoe@senate.gov just didn't deign to reply to my e-mail. But it's equally plausible there's some subtle misconfiguration - like an e-mail forwarder that breaks the SPF check. It's not like I can sign up for a senate.gov e-mail address to test with.

Meanwhile, to paraphrase an old joke, when your senator rejects your e-mails you have a problem. When your senator rejects @gmail.com they have a problem.

Sure, strictly speaking it's impossible to ensure that a message was actually read by a user even with automated end-to-end delivery acknowledgements and/or in centralized systems: UIs manage to gobble/hide messages, users fail to find how to open attached documents (and declare that those are missing), etc. But I imagine that a survey/statistics would still help to estimate how bad deliverability in general (in a variety of common cases) is: without that there are differing and even more vague ideas of its state.

I can prove a different negative with my own mailserver - when I've sent things to @gov, they've always been responded to. I think that just proves the government reads ALL their spam.

I can only contribute my own experience. I have a dedicated server with an IP address in a datacentre, with approx 6 users using my email server for their primary email. DKIM/DMARC/SPF all configured correctly. I also have policies that suspend logins for accounts if they send too much in a certain timeframe, because this is a pretty good indicator of account compromise. The limits would never be hit by humans.
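A policy like the one described above (suspend an account that sends too much in a given timeframe) could be sketched roughly as follows. This is only an illustration, not the commenter's actual setup: the class name `SendRateGuard`, the default limit of 100 messages and the one-hour window are all assumptions.

```python
import time
from collections import defaultdict, deque


class SendRateGuard:
    """Suspend an account once it exceeds `limit` sends within `window` seconds.

    Hypothetical helper for illustration; limit and window are assumed values.
    """

    def __init__(self, limit=100, window=3600.0):
        self.limit = limit
        self.window = window
        self.sent = defaultdict(deque)  # account -> timestamps of recent sends
        self.suspended = set()

    def allow(self, account, now=None):
        """Record a send attempt; return False if the account is suspended."""
        if account in self.suspended:
            return False
        now = time.time() if now is None else now
        stamps = self.sent[account]
        # Forget sends that have fallen out of the sliding window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) >= self.limit:
            # Over the limit: suspend until an admin clears the account.
            self.suspended.add(account)
            return False
        stamps.append(now)
        return True
```

The suspension is deliberately sticky: a compromised account stays blocked until someone looks at it, rather than resuming once the window has passed.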

I've had three issues. The first was delivering to outlook.com, but this was temporary and resolved relatively quickly: I simply contacted their support. At the time, they didn't bother to validate DKIM or DMARC according to their headers.

The second was a sender sending to us with a misconfigured SPF policy. I had quite strict rules that spf failure => user's junk folder that I had to relax, but I also had a discussion with the admins at the sending company to explain the issue.

The third was yahoo. For reasons known only to them, they decided that IPs they've never seen before will be blocked by returning an smtp deferral that is permanent, which is bad for legitimate mail servers because the email remains stuck in the mail queue forever. I ended up discussing this with their support also and after some discussion that block too was removed.

That's pretty much it. I receive dmarc reports now from many providers so I've an idea what percentage of our email is quarantined or rejected (none). I've been running email since 2011, for my own main email and a few others. I don't think deliverability is that much of an issue and I was able to resolve all the problems I've had in 10+ years of doing this by emailing support, explaining myself and asking to be unblocked. Usually this simply resulted in "OK but if you do bad things we will block you no guarantee of inbox delivery etc etc etc". That's fine. It seems that there is a large degree of per-account spam filtering as well at the big providers mapping to individual users' preferences.
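For anyone wanting to turn DMARC aggregate reports into a "percentage quarantined or rejected" figure like the one mentioned above, a minimal sketch might look like this. It assumes the report has already been unzipped to plain XML without namespaces and follows the usual `feedback/record/row` layout; the sample report and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, heavily trimmed report; addresses are documentation ranges.
SAMPLE_REPORT = """<feedback>
  <record><row>
    <source_ip>192.0.2.10</source_ip><count>12</count>
    <policy_evaluated><disposition>none</disposition></policy_evaluated>
  </row></record>
  <record><row>
    <source_ip>198.51.100.7</source_ip><count>3</count>
    <policy_evaluated><disposition>quarantine</disposition></policy_evaluated>
  </row></record>
</feedback>"""


def disposition_counts(report_xml):
    """Sum message counts per disposition (none / quarantine / reject)."""
    totals = Counter()
    for row in ET.fromstring(report_xml).iter("row"):
        count = int(row.findtext("count", default="0"))
        disposition = row.findtext(
            "policy_evaluated/disposition", default="unknown"
        ).strip()
        totals[disposition] += count
    return totals


def filtered_share(totals):
    """Fraction of reported messages that were quarantined or rejected."""
    total = sum(totals.values())
    if total == 0:
        return 0.0
    return (totals["quarantine"] + totals["reject"]) / total


totals = disposition_counts(SAMPLE_REPORT)
print(dict(totals), filtered_share(totals))
```

Note that these reports only cover what the receiver's DMARC evaluation did with the mail; they say nothing about later per-user spam filtering.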

Of course, if you don't set up SPF/DKIM/DMARC, or you have an IP with poor reputation (you can check the DNSBL) or worse a residential address, you will have trouble. I would generally look for a provider that has a relatively strict acceptable use policy, and in particular doesn't allow VPN endpoints to be run from their infra for your email, to reduce the chances your IP has a terrible reputation with the big providers. Also, join all the sender programmes, set reverse dns, don't let your users do things like send bulk email and that will reduce many of the problems.
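Checking an address against a DNSBL, as suggested above, is just a DNS lookup: the IPv4 octets are reversed and prepended to the list's zone. A rough sketch, where `dnsbl.example` is a placeholder zone (substitute a list you are actually allowed to query) and the function names are made up:

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """Return True if the zone answers with an A record (needs network access)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No such name: the address is not listed in this zone.
        return False


print(dnsbl_query_name("192.0.2.1", "dnsbl.example"))
```

Some lists answer with special codes when they refuse a query (for example from a large public resolver), so a positive answer is best treated as a prompt to check the list's own lookup page rather than as proof of a listing.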

> The third was yahoo. For reasons known only to them

I had to work around them for some datacenter mail relays. The only solution I found was to sum up the number of mail relays behind a SNAT and then apply rate limits for their domain to not exceed 6 concurrent connections total per SNAT. To your point and AFAIK they do not document this anywhere.
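The arithmetic behind that workaround can be sketched as below: split a total budget of 6 concurrent connections across however many relays share one SNAT address. The figure of 6 comes from the comment above; the function name and the even-split policy are assumptions for illustration.

```python
def split_connection_budget(relays_behind_snat, total_limit=6):
    """Split a per-SNAT concurrency budget across the relays behind it.

    Returns one limit per relay; the limits always add up to total_limit,
    so the shared address never exceeds the budget for the target domain.
    """
    if relays_behind_snat < 1:
        raise ValueError("need at least one relay")
    base, extra = divmod(total_limit, relays_behind_snat)
    # The first `extra` relays get one more slot than the rest.
    return [base + 1] * extra + [base] * (relays_behind_snat - extra)


print(split_connection_budget(4))
```

With more relays than slots, some relays end up with a limit of 0, which means mail for that domain would have to be routed through the others.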

I'm fairly sure they were abusing the standard with this particular technique. Deferred messages are OK if they can be retried later and that's what the MTA will try to do. Permanent deferral I suspect is really supposed to mean "we can't deliver right now and we don't know when but keep retrying".

What this does not do is trigger "undeliverable mail returned to sender" messages, so the end user has no idea their message is stuck in their own MTA's mailqueue until the MTA decides it has tried enough and gives up, and MTAs will usually persist for quite some time.
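To make the difference concrete: a sending MTA treats 4xx replies as "try again later" and 5xx replies as "give up and bounce", so a block expressed as an endless 4xx keeps the message queued until the MTA's own queue lifetime runs out. A simplified sketch of that decision, where the names and the five-day lifetime are assumptions rather than any particular MTA's behaviour:

```python
from enum import Enum


class Action(Enum):
    DELIVERED = "delivered"
    RETRY_LATER = "keep in queue and retry later"
    BOUNCE = "return to sender as undeliverable"


def next_action(reply_code, queued_seconds, max_queue_seconds=5 * 24 * 3600):
    """Decide what a sending MTA does with a message after an SMTP reply."""
    if 200 <= reply_code < 300:
        return Action.DELIVERED
    if 400 <= reply_code < 500:
        # Transient failure: the sender only hears about it once the
        # message has outlived the queue lifetime.
        if queued_seconds >= max_queue_seconds:
            return Action.BOUNCE
        return Action.RETRY_LATER
    if 500 <= reply_code < 600:
        # Permanent failure: bounce immediately, so the sender knows.
        return Action.BOUNCE
    raise ValueError(f"unexpected SMTP reply code: {reply_code}")


print(next_action(451, queued_seconds=3600))
```

A 5xx rejection would have told the user within seconds; the 4xx path leaves them waiting for days.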

Spammers won't care what error code you send them or even worry about deferral messages, which is why the temporarily deferred spam trick works (first send is deferred for 1 hour, if you are a genuine MTA and you try again respecting this, it works). But permanent deferral, as I say, is very user hostile. The user thinks they've sent an email, but it isn't in the spam folder of the recipient. The sysadmin then has to go and dig to find out what exactly happened, and remove the mail from the mailqueue.
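The temporary-deferral trick described above is commonly called greylisting. A toy version of the receiving side, assuming the one-hour delay from the comment and an in-memory table (the class name and the reply codes chosen are illustrative):

```python
import time


class Greylist:
    """Defer the first attempt from an unknown sender; accept later retries."""

    def __init__(self, min_delay=3600.0):
        self.min_delay = min_delay
        self.first_seen = {}  # (client_ip, sender, recipient) -> first attempt time

    def check(self, client_ip, sender, recipient, now=None):
        """Return an SMTP reply code: 451 to defer, 250 to accept."""
        now = time.time() if now is None else now
        key = (client_ip, sender, recipient)
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return 250  # a genuine MTA came back after the delay
        return 451  # temporary failure: please retry later


g = Greylist()
print(g.check("192.0.2.1", "a@example.org", "b@example.net", now=0))
print(g.check("192.0.2.1", "a@example.org", "b@example.net", now=3600))
```

The key point is that the deferral ends: a retry after the delay is accepted, unlike the indefinite deferral complained about here.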

Luckily so far as I know we are only emailing a single yahoo address.

tl;dr the technique they are using is designed to handle the case where the receiving MTA is offline temporarily. There is a spam defence trick you can use and I don't object too much to that, but they used it to implement their block list rather than outright rejecting, and set the timeout to deferred indefinitely, which is just bad.

For me, thinking about potential deliverability issues is just too stressful. Even if it works 99% of the time, who knows how important the remaining 1% of emails will be. Personally, having control over my own domain is a good-enough middle ground.

Been doing this long enough that I can sense right away if people aren't getting my emails. It's not great to call people up and ask if they got an email, and admit that my mail might not be getting through today, but it's not terrible. Gives me a chance to touch base with them for a minute, since I was emailing them anyway.

Having people whitelist you on google / yahoo / msn because you explicitly ask them to does have a wider effect, as far as I can tell, of keeping your emails in the clear for everyone else.

But there's no deliverability guarantee with such a setup either. AIUI the premise of this thread is that the chances of delivery are higher with it, which may be the case (especially taking into account possibilities of misconfiguration on the sender's side), but that's precisely what made me wonder about a survey/statistics once again: I wonder whether there's actually an observable difference.

> and I tend to be unsure how bad it actually is: sometimes it does seem pretty bad, other times it sounds like it's fine

That is pretty much it. One factor is that once you are on a blacklist it can spread like wildfire and be a real faff to get off them all again, so the risk is small but the hassle if it happens is high. Also if you send mail for numerous people there is going to be a much higher risk: every extra user/account/address is an extra hack target (do all your users have good, non-shared, passwords?) or just extra volume that might be accidentally classified as junk (and once something from your server gets classed that way, future content may get more aggressively analysed and more mistakes may happen).

I've run my own mail server, including sending mail directly, for many years and to my knowledge have not had a significant delivery problem. But I have a few mitigating factors: the IPv4 address is essentially on a commercial ISP range, not one that looks like a residential account or a VPS service provider; the ISP is one that takes junk mail seriously, so there is less “splash damage” potential; the same range has been used this way for several years (the main sender has moved around that small range, when testing upgrades on a copy VM for instance, but never away from it entirely), so it never looks like a brand new mail server these days; I only serve myself and a very small number of other users; and our outgoing mail volume is pretty low.

It is a bigger problem for hosting services (much bigger user-base and little control over what they might send) or if you are sending from one of their ranges, if sending from a residential ISP address range, if your volume is high (perhaps you have apps that send mail as well as your personal mail?), etc., but it can be a problem for everyone.

I'm rebuilding my mail service soon (moving off Zimbra to just configuring the parts myself, as we don't need the extra features these days, it is too chunky for just a mail server, and at the end of next year they stop releasing easy install packages for the non-paid users (they already have for v9., next year v8. hits EOL)) at which point I might reconsider where it is hosted and if I should be sending via a paid SMTP relay to let them worry about deliverability, though as far as I know I've not had a problem.

> and I tend to be unsure how bad it actually is: sometimes it does seem pretty bad, other times it sounds like it's fine, and possibly the chatter about failed deliveries is caused by misconfigured servers and/or misunderstandings.

It's not just misconfigured email server settings like DKIM, SPF, DMARC etc. One can correctly set all of those and still have the outgoing emails rejected or spamholed. Why? Because the big email players like Gmail, Microsoft Outlook.com, etc use black-box heuristics built on reputation datapoints that exist outside the boundaries of email settings, such as "amount of email volume", "# of spam abuse reports from ip block", etc.

Because "sender reputation" cannot be encoded into an email configuration (DKIM/SPF/DMARC/etc), that's why nobody can provide a convenient Docker container with a perfectly working self-hosted email server that can reliably send email. If such a thing existed, the spammers would use it as well!

A datapoint such as "volume of email from this ip" is an unstated behavior/activity number and not an identity setting like DKIM.

And the invisible heuristics keep changing, which causes email setups that previously worked to stop working later for no obvious reason. Why? Because there's a constant arms race between spammers and email filter algorithms. This means that other people's spam heuristics, which keep evolving and which you don't control, can block your self-hosted outbound emails without warning.

That's why you have examples of skilled admins who know what they're doing, and who had a working self-hosted setup for years, suddenly getting their emails rejected: https://www.tablix.org/~avian/blog/archives/2019/04/google_i...

As to the contradicting anecdotes about the difficulties of self-hosting email, the issue is the same one you see in comments about Uber or umbrellas: an unstated environment that affects how the writer perceives the truth or relevance of their anecdote.

- "The problem of self-hosted email getting blocked is overstated. I've been doing it and it's working fine."

- "I'm not sure what value Uber provides. Taxi services have smartphone apps."

- "I'm not sure why people use umbrellas. Every time I walk outside, it's not raining."

As an example of evangelists and advice-givers not noticing their unstated environments... Back in October 2017, a commenter (lucb1e) argued[1] that I was exaggerating the difficulties of reliably sending email, but later, in 2019, he eventually confirmed the same difficulties! [2]

[1] https://news.ycombinator.com/item?id=15525505

[2] https://news.ycombinator.com/item?id=19757607

Oh yeah. This is very true, and getting worse every year.

I've had this discussion on HN before. It's gotten to the point where I've had to have my clients and their corporate lawyers go to bat against mail providers to maintain deliverability. No mail provider has any interest whatsoever in allowing an independent mailserver to continue delivering now.

So far, legal threats have worked when push came to shove against certain networks. But I imagine the difficulty is only going to increase.

I'm extremely curious about the legal precedents you used to accomplish this, particularly around forcing certain providers to un-spam or un-block your emails. What grounds did your lawyers find that made that possible?