by 15characterslon 1635 days ago
> It was easier than I thought to create a mail server that works as well as Gmail’s

No it isn't and no you didn't.

The article doesn't even cover basic stuff like email rules and spam filtering (incl. tuning and spam learning). It doesn't "look after itself" like the author wanted (the article doesn't mention any update strategy). The author acknowledges that email servers are "open to attack", but this setup doesn't seem to include any security improvements over traditional setups. In fact, maintaining this looks harder due to the number of custom scripts and the lack of good documentation.

And of course it doesn't cover any of the things that actually make Gmail special like labels, having a consistent set of apps for web and mobile, push notifications (esp. on iOS), really good spam filtering, really good search (incl. OCR for attachments), high availability, image proxying, smart suggestions, datacenter security, Google doing code and infrastructure audits all the time, using reproducible builds, ...

It's great that the author is experimenting and learning, but if I had any private data hosted by the author, I would be worried now.

> if I had any private data hosted by the author, I would be worried now.

Merry Christmas to you as well.

Such negativity for just showing something I knocked up in half an hour, something that I thought might be helpful, with experiences on how to make it more Gmail-like.

Attacking the writing is fine, but insinuating that my custody of private data is in question is pretty shitty.

> Such negativity for just showing something I knocked up in half an hour, something that I thought might be helpful, with experiences on how to make it more Gmail-like.

GP's feedback is direct but quite right imo. I trust the author had only the best intentions in mind, but "knocking something out in half an hour" and sharing it is one thing; good privacy and security engineering probably requires much more time. Quite frankly, the wording of the article can be insulting even to folks who have been working on that problem professionally for several years.

Were it presented differently, it would get different feedback, I'm sure. More like "hey HN, I made the first three steps, what would be next?", i.e. an effort towards understanding the problem better.

It's not negativity. You wrote an article showing you clearly don't understand at all what the stakes are or what you are doing.

What you did is a basic setup which was covered in O'Reilly's TCP/IP book back in 1996. The world has changed since.

Please learn from the community here.

Docker, cloud volumes, SpamAssassin, Dovecot, ClamAV, fail2ban, DKIM, DMARC. Ask someone in 1996 what these are and see what you’d get back. The article covered setting all of these up.

However, my main objection to the OC was the attacks on my professionalism. Unless you’re going to defend that, I don’t really care.

I think the author and submitter got exactly what they asked for by posting something to HN that is, by their own admission, low-effort, as it took only 30 minutes to knock out. Many of the commenters call this out, since in their opinion the content does not live up to what the headline promises.

Also, one aspect of professionalism is to be thankful for feedback rather than trying to interpret it as an attack.

I am the author.

> if I had any private data hosted by the author, I would be worried now.

That is an attack on me personally and says nothing about the article. The article also took much longer than 30 minutes to 'knock out'; more like 3 hours all in all.

I didn't mean to insult you. I think it's great if you're experimenting and I fully support that. It's just that the headline set high expectations and the article reads like this is being used in production, which I would strongly advise against.
> that actually make Gmail special like labels

I hate labels.

At $WORK we use Gmail and I get a lot of automated stuff (cron, etc). I want these types of messages to go into folders. I don't want them in my "all" / archive area because they just clutter up searching for other things.

Perhaps labels work for other people / the general public, but for me 'traditional' folders are what work best.

> I hate labels.

Seriously! Gmail labels are a very poor mis-implementation of folders that just make a mess of sorting email.

I second that, my worst problem is spam filtering. The rest I have set up, except DKIM and DMARC which are not worth bothering with.
For me personally, one of the most effective means of knocking out the first 95% of spam was the S25R regex methodology [1] created by Asami Hideo, which seems to keep the load on SpamAssassin and ClamAV really low. I've had to adjust the regex rules a little over the years, but it's really low maintenance for my setup. There are also lists of IP addresses and networks known to be malicious that you can block, which also reduces the load and log volume. [2]

[1] - http://www.gabacho-net.jp/en/anti-spam/anti-spam-system.html [No HTTPS, Sorry]

[2] - https://github.com/firehol/blocklist-ipsets.git
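
As a rough illustration of the idea behind S25R (treating clients whose reverse DNS name looks like an end-user line, rather than a mail server, as suspicious), here is a minimal Python sketch. The three patterns are simplified stand-ins written for this example, not the published S25R rule set, and `looks_like_end_user_host` is a hypothetical helper. Treating a missing reverse DNS name as suspicious is also an assumption of this sketch.

```python
import re

# Simplified, illustrative patterns in the spirit of S25R.
# They are NOT the published rules; see link [1] for the real ones.
PATTERNS = [
    # Two groups of digits in the leftmost label, e.g. "220-139-165-188.".
    re.compile(r"^[^.]*[0-9][^0-9.]+[0-9]"),
    # Five or more consecutive digits in the leftmost label.
    re.compile(r"^[^.]*[0-9]{5}"),
    # Leftmost label starts with a typical access-line prefix and has a digit.
    re.compile(r"^(dhcp|dialup|ppp|[achrsvx]?dsl)[^.]*[0-9]"),
]


def looks_like_end_user_host(rdns_name):
    """Return True if the reverse DNS name resembles a dynamic/end-user host."""
    if not rdns_name:
        # Assumption of this sketch: no reverse DNS at all counts as suspicious.
        return True
    host = rdns_name.lower()
    return any(p.search(host) for p in PATTERNS)


for name in ("mail.example.com", "220-139-165-188.dynamic.example.net", "ppp1234.example.net"):
    print(name, looks_like_end_user_host(name))
```

In practice a check like this would live in the MTA's client restrictions, so matching clients are turned away before SpamAssassin or ClamAV ever see the message.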

Thanks a lot for this! I’ll try them out too.
In my experience SpamAssassin works wonderfully. There are a few false negatives (1-2 mails per week), but I have not had a single false positive, which is very important to me. Google, for example, is much worse in that regard, which forces me to check spam every few days to ensure no legitimate mail ends up there, so it's like having no spam filtering at all: I have to read it all anyway.
> basic stuff like email rules

I mean, Gmail has the most limited, frustrating filtering rules (or lack thereof) of any email system I've used. Any self-hosted solution will be infinitely better.

> Gmail special like labels

How is that special?

Also, Gmail's spam filtering is not very good. You know how every business has that "check your spam folder" bit? That's because Gmail is so terrible about it. It is easy to do much better with a self-hosted solution and put an end to Gmail's false positives.

It seems that you like Gmail, but in my experience it's one of the worst mainstream email implementations ever. Doing better is a trivial bar.

Gmail's spam filtering isn't exactly a high bar.
Based on my experience running mail servers in the past (both personal and corporate), I'd say you're wrong.
Based on my experience running mail servers for a long time and today, I'd say OP is right.

Gmail's spam filtering is terrible. On my various Gmail accounts, I both get spam in the inbox and have good email go to the spam folder. And there's nothing you can do: you can mark it not-spam a thousand times and it's still a crapshoot.

I don't have any of those problems with my self-hosted email.

I second this.

Gmail's spam filtering is top notch. I just stopped caring about obfuscating or hiding my email address (which I have used since Gmail's beta invitation program) and I can count the spam I actually read in a year with one hand.

Gmail's spam filtering has a high false positive rate.

It classifies Stripe's and PayPal's important security emails as spam; I posted previously on HN:

https://news.ycombinator.com/item?id=19536465

It's easy to bring down the number of false negatives if you allow the number of false positives to be arbitrarily large.

On my GSuite business email, I've had > 50 incoming business-relevant emails this year that were incorrectly classified as spam. My personal self-hosted email server [1] lets through a bit more spam than Gmail, but it also doesn't suffer this big false-positive rate.

[1]: https://nh2.me/recent/Running-your-own-mailserver.pdf
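
To make the trade-off above concrete, here is a small Python sketch with made-up data. The scores, labels, and thresholds are all hypothetical; the only point is that lowering the threshold at which a message is flagged reduces false negatives (spam in the inbox) at the cost of more false positives (legitimate mail in the spam folder).

```python
# Hypothetical (score, is_spam) pairs; a higher score means "looks more like spam".
MESSAGES = [
    (0.5, False), (1.2, False), (2.8, False), (4.1, False),
    (3.0, True), (4.5, True), (6.0, True), (7.5, True),
]


def confusion(threshold):
    """Return (false_positives, false_negatives) when score >= threshold is flagged."""
    false_pos = sum(1 for score, spam in MESSAGES if score >= threshold and not spam)
    false_neg = sum(1 for score, spam in MESSAGES if score < threshold and spam)
    return false_pos, false_neg


# Strict, middle, and very aggressive thresholds.
for threshold in (5.0, 3.5, 1.0):
    fp, fn = confusion(threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

On this toy data the aggressive threshold lets no spam through but flags three of the four legitimate messages, which is the failure mode being described.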

"I can count the spam I actually read in a year with one hand."

This is partly because Gmail is good at classifying emails as spam/ham.

But it's partly because it's more tolerant of false positives (ham sent to the spam folder) than you or I would be if we were tweaking our own spam filter.

I occasionally check my spam folder, and there are usually some mailing list emails that I don't care about, but which I did actually subscribe to, and would have wanted to reach my inbox.

> Gmail is good at classifying emails as spam/ham.

I wish they'd apply that discrimination to their SMTP output.

Seriously? I actually think gmail's spam filtering is brilliant - I probably average less than a single spam email a year that it doesn't catch.

Contrast that with every corporate email spam filter I've ever been subject to, which vary from "shit" to "OK", and Gmail is completely in another league.

My problem with Gmail is the false positives. (Or is it negatives?) They routinely send too much to the spam box and others tell me they have the same experience.

The worst is when they take email from one Google-hosted domain and send it to spam in another Google-hosted domain, even though the email didn't leave their network at all.

Still, I agree that the overall level is pretty good and hard to duplicate.

> even though the email didn't leave their network at all.

FYI, Gmail treats all of its children equally. Mail from one Google user to another is subject to the exact same treatment as mail received via SMTP (and, indeed, Gmail sends traffic to itself over SMTP). If you study the headers of messages in Gmail, you can form a picture of how they allocate and use the virtual IPs.
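
For anyone who wants to study headers this way, a minimal Python sketch follows. It uses the standard library `email` module to pull the bracketed IP addresses out of a message's `Received` headers. The sample message is entirely made up (reserved example hostnames and addresses, not real Gmail output), and `relay_ips` is a hypothetical helper name.

```python
import re
from email import message_from_string

# Made-up message for illustration; hostnames and IPs use reserved example ranges.
RAW = (
    "Received: from relay.example.net (relay.example.net. [192.0.2.10])\n"
    "        by mx.example.com with ESMTPS id abc123;\n"
    "        Fri, 25 Dec 2020 10:00:01 -0800 (PST)\n"
    "Received: from [198.51.100.7] by relay.example.net with HTTP;\n"
    "        Fri, 25 Dec 2020 10:00:00 -0800 (PST)\n"
    "From: alice@example.net\n"
    "To: bob@example.com\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)

# IPv4 or IPv6 literal inside square brackets, as commonly written in Received headers.
IP_IN_BRACKETS = re.compile(r"\[([0-9A-Fa-f:.]+)\]")


def relay_ips(raw_message):
    """Return bracketed IPs from Received headers, newest hop first."""
    msg = message_from_string(raw_message)
    ips = []
    for hop in msg.get_all("Received", []):
        ips.extend(IP_IN_BRACKETS.findall(hop))
    return ips


print(relay_ips(RAW))
```

Running this over many saved messages and counting the addresses per hop is one way to build up the kind of picture described above.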

I get them every day.

https://imgur.com/a/wXCocLd

Have not had to deal with spam on my personal Gmail address in the 10 years I've been using it, and I'm having the same experience running a big Workspace organization. Their spam/phishing detection is making my job a lot easier.
I also have serious doubts about Google's spam fighting. While they catch a lot of spam in the spam folder, they are simultaneously overzealous, catching normal emails that I receive and read regularly, and underprepared, as if mail claiming to come from myusername@aol.com but sent straight to Gmail's servers weren't totally obvious spam.
Gmail's spam filtering is still the best I've seen from the major e-mail providers, so I disagree with your assessment.