Hacker News
by AndrewSnow 2995 days ago
Absolutely. I don’t blame people who don’t interact with email for not reading the relevant RFCs, but not verifying control and expecting local part uniqueness to mean uniqueness of users is obviously busted to anyone who has worked with email. This is Netflix failing to understand part of their product surface.

Both the dot behavior and the even more common ‘+’ feature are perfectly spec compliant.

4 comments

It really bothers me the number of web services which reject email addresses containing '+' in the local part.

If you're going to try to "validate" an email address, read the goddamn RFCs.

One of the two reasons I changed my recipient_delimiter parameter to '.'

The other would be that spammers know that anything after a + is usually optional and strip it. Can't do that when the delimiter is a '.'

I do the same thing, for the same reason. I haven't A/B tested both options or anything, but I know I've gotten spam where they stripped the "+" parameter.
I also did that, but I made '_' work, too. Part of me wants to add 'x' as an alternative just for further obfuscation.
Try using an email address with .wedding or .solutions TLD. Loads of absolutely brain-dead sites refuse to allow them: sometimes they validate by TLD length (all TLDs are 2 or 3 characters, apparently), other times they reject TLDs they haven't whitelisted.
"(all TLDs are 2 or 3 characters, apparently)"

Which is odd, since there are some very, very old TLDs that are "long" ... I am thinking of .bitnet, .uucp and even the old .ussr[1] ...

[1] "Initially, before two-letter ccTLDs became standard, the Soviet Union was to receive a .ussr domain." (https://en.wikipedia.org/wiki/.su)

Those were never in widespread use in the email era. Anyone who knows about them is already technically sophisticated enough not to make this mistake.
Someone definitely just copied a regex from StackOverflow (frankly if you actually look at the RFC a regex seems like a crazy approach).
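
For what it's worth, here is a rough sketch of the kind of deliberately loose check that avoids the copied-regex trap. The function name `looks_plausible` is made up for illustration, and the rules (one usable "@", non-empty sides, no whitespace in the domain) are my assumption about a sane minimum, not anything the RFCs prescribe. It makes no guess about TLD length or a TLD whitelist.

```python
def looks_plausible(address: str) -> bool:
    """Loose sanity check only; it does not prove the address can receive mail."""
    # Split on the LAST "@" so quoted local parts containing "@" are not cut short.
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return False
    # No assumptions about TLD length or an allow-list of TLDs.
    return not any(ch.isspace() for ch in domain)
```

Anything that passes still has to be confirmed by actually sending mail to it.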
I like the idea of adding +spam@gmail.com, but it would be really easy for this to be invalidated by just stripping this from your email before selling it in a mailing list.
Instead of blacklisting the +spam@gmail.com email, you could whitelist emails like +netflix@gmail.com. You can create a filter so that if it doesn't match the whitelist - including stripping the plus - then it will be automatically binned.

I describe this technique in my blog post[0]. I'll warn everyone now though, you'll probably want an email address for real people that you trust (like +friends@gmail.com). Also, you'll rarely have to email companies, but it is a pain if you need to do it from the +plus email.

[0]: http://iamqasimk.com/2016/10/16/absolutely-zero-email-spam/
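
A minimal sketch of the whitelist idea above, assuming you can run your own filtering code over the recipient address. The names `ALLOWED_TAGS`, `plus_tag` and `should_bin` are hypothetical, and the tag list is just the examples from this comment; a real Gmail filter would be set up in its own UI instead.

```python
# Hypothetical whitelist of plus-tags that are allowed through.
ALLOWED_TAGS = {"netflix", "friends"}

def plus_tag(address: str):
    """Return the text after the first '+' in the local part, or None if there is no tag."""
    local, _, _domain = address.rpartition("@")
    _base, sep, tag = local.partition("+")
    return tag.lower() if sep else None

def should_bin(address: str) -> bool:
    """Bin mail whose recipient has no tag (the plus was stripped) or an unknown tag."""
    return plus_tag(address) not in ALLOWED_TAGS
```

With this, mail to the bare address is binned just like mail to an unlisted tag, which is the point: a stripped "+" no longer gets through.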

>Also, you'll rarely have to email companies, but it is a pain if you need to do it from the +plus email.

This is absolutely true, and it's very painful. I sadly now recommend against using +plus addressing if there is a possibility you'll need to get in touch with support for a website for any reason, and I have a cautionary tale. So many websites have incredibly shitty "security features" and incredibly shitty code.

I had an account with a payment processing website with myname+website@mydomain.tld. They sent me an email requesting some additional info about a payment I was to receive in order for it to clear. I responded from myname@mydomain.tld. The automated system helpfully informed me that they can only accept email from the email in the account (argh). So I sigh, go over to the website, and change my email to myname@mydomain.tld. No luck---there's already an account with that email. OK, I might have created one before and don't remember. I try to login with this info, hoping I can delete that account, but can't seem to get the password right, and it's not saved anywhere. So I use the "Forgot Password" feature. Oops, it looks like I haven't finished the onboarding process with that account, and so I can't reset the password on it (who even thought of this?!). So I make an alias of website@mydomain.tld, change my email to that, and try responding from that alias. No luck. Turns out that you have to actually use the address they originally sent the email to. If you've changed it, oh, that's too bad---please open a support ticket with us.

It took around 7 days of back-and-forth and waiting for responses from support (lots of waiting!) to explain that I'm just trying to respond to an email they sent me, and a lot of canned responses from people completely misunderstanding what my problem is.

Would not recommend to anyone.

Thanks for sharing! You thought this one through more thoroughly than I have. And your closing line about it being impractical is unfortunately all too true.
I recently decided to ditch Gmail because I don't trust Google any more than Facebook. One nice side effect of this is that now that I'm using my own domain, all company signups can be to me@spam.domain.com, which is much harder to filter well.
I use my own domain, but still use Google for my email service.

I've had fleeting thoughts of moving away, but am pretty used to Google's spam filtering, labelling, search, and not having to care about space or managing my own kit.

Are you DIY'ing everything?

I delayed moving away for literally years for this reason, but in 2016 I finally made the jump to fastmail and it was much smoother than I thought it would be. I don’t get any more spam than I did with gmail. I still forward my gmail to my fastmail but typically when I get a forwarded mail I either cancel whatever subscription it is or update its email settings, so I get rather little forwarded now.
Like dkersten, I moved to Fastmail. I used to DIY it but got tired of maintenance.
No one's saying that they aren't spec compliant; the writer is complaining that Google doesn't tell its users that their inbox will include emails for more than one address.
When I get an email to an address that isn't exactly the same as mine (e.g. due to extra dots in it), desktop gmail does actually show a small informational link next to the receiving address saying something along the lines of "yes, this really is you; click here to learn more". I haven't seen this on mobile, although I use the Inbox app, so it's possible that it's there on the vanilla gmail app.
I was under the impression that literally anything goes on the left side of the @. It’s all up to the individual mail server
Even things at the right side of the @ are up to the E-Mail server. You can send an E-mail to @gmail.com or @GMAIL.COM and Google could be routing it differently.

It'll resolve the same in DNS, but what the user typed will be encoded in E-Mail headers, and you could route differently depending on whether it's upper-case, mixed-case or whatever.

Not true, domain part is case insensitive by the standard.

Server can decide for non-standard behavior, but that would be foolish.
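
As a small illustration of that split, a sketch that folds only the domain to lower case and leaves the local part exactly as typed. `normalize_domain_case` is a made-up helper name, and lower-casing is just one way to get a case-insensitive comparison.

```python
def normalize_domain_case(address: str) -> str:
    """Lower-case the domain only; the local part is left untouched."""
    # rpartition splits on the last "@", which is the one separating the domain.
    local, sep, domain = address.rpartition("@")
    if not sep:
        raise ValueError("no '@' in address")
    return f"{local}@{domain.lower()}"
```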

The name part, per the standard, is case sensitive. However, some organizations I get email from canonicalize it to all caps, and one even removes the dots(!) and all-caps it. That matters for my gmail, because I have a filter that sends all non-dotted email to spam, since 90% of my spam is non-dotted. Since that particular email was important and I'd just happened to notice it in spam, I checked the capitalization change and called them and even managed to get through to speak with their email manager. He absolutely insisted I was wrong about it mattering and said they wouldn't change it.
Changing case could be a co's lazy way of checking for existing accounts. In a perfect world they would store the input as-is and also use a lower or upper func on the indexed col. I suppose if they are verifying emails with their users and not seeing a big drop-off in verified accounts, they probably don't care to be too exact. Removing dots from the user portion is pretty shitty though.
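
A rough sketch of that "store as-is, index on lower()" approach, using SQLite from Python's standard library purely as a stand-in database. The table, index and `register` names are made up, it assumes an SQLite build new enough to support indexes on expressions, and SQLite's built-in `lower()` only folds ASCII letters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
# Uniqueness is enforced on the lower-cased value; the stored value keeps its case.
conn.execute("CREATE UNIQUE INDEX users_email_lower ON users (lower(email))")

def register(email: str) -> bool:
    """Insert the address exactly as typed; return False on a case-insensitive duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        return False
```

Note that this treats the local part as case-insensitive for duplicate detection, which is a policy choice rather than something the standard requires.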
The domain being in any case for the purpose of delivery is part of the standard, but there's nothing I know of which would prohibit you from implementing local delivery and routing in any way you want once the mail is accepted.
You have no guarantee that the domain part case is preserved from what the user wrote, if it's not defined to be case-sensitive. So you can do what you want, but the input data are not reliable.
IIRC, not quite literally, but pretty close. The email RFCs in general tend to be much less constrained than one might expect.
Yup. Even spaces are allowed if you put double quotes around the local part: “Kevin Spacey”@example.com. It’s really surprising, there’s very little that can be verified if you strictly follow the RFC.
And comments:

    Muhammad.(I am  the greatest) Ali @(the)Vegas.WBA
(an example lifted directly from RFC-822, apparently written by a madman in love with 70s parser theory).
> there’s very little that can be verified if you strictly follow the RFC.

That's not correct. You can reliably, completely validate all email addresses . . . by sending an email to them containing information that can be used to confirm the owner's identity. All together now: the MTA is the only source of "is it valid or not" for email addresses.
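
To make that concrete, a bare-bones sketch of the confirm-by-email flow. Everything here is hypothetical: `start_verification` and `confirm` are invented names, the in-memory dict stands in for real storage, and actually sending the message (and expiring old tokens) is left out.

```python
import secrets

pending = {}      # token -> address still waiting for confirmation
verified = set()  # addresses whose owner clicked the link

def start_verification(address: str) -> str:
    """Create an unguessable token; it would be mailed to `address` inside a link."""
    token = secrets.token_urlsafe(32)
    pending[token] = address
    return token

def confirm(token: str) -> bool:
    """Mark the address verified if the token is known; each token works once."""
    address = pending.pop(token, None)
    if address is None:
        return False
    verified.add(address)
    return True
```

The only "validation" is that someone who can read the mailbox presented the token back.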

This is a common issue with text-based formats and IETF RFCs. HTTP allows comments in some headers. And allows line breaks:

  H: hi
    Mom
Is the same as

  H: hi Mom
Bets on how many http clients and servers get this right? Without losing speed?

My guess is when you're not responsible for ensuring compatibility or having to deal with writing robust, fast code, the temptation to be cute with your format overtakes things.
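
For reference, a small sketch of what "getting it right" means for the folded header above: a line that starts with a space or tab continues the previous header. `unfold_headers` is a made-up name, and this ignores everything else a real parser has to handle. (RFC 7230 has since deprecated this kind of folding in HTTP, so a modern implementation may simply reject it.)

```python
def unfold_headers(raw: str) -> list[str]:
    """Join continuation lines (leading space or tab) onto the preceding header line."""
    lines: list[str] = []
    for line in raw.splitlines():
        if line[:1] in (" ", "\t") and lines:
            # Continuation: replace the line break and indentation with one space.
            lines[-1] = lines[-1].rstrip() + " " + line.strip()
        else:
            lines.append(line)
    return lines
```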

I expect all commonly used HTTP clients (Chrome, Firefox, IE, even libcurl) and servers (Apache, Nginx, does IIS still exist?) get this right.
I, too, would expect that the most common pieces of software with massive amounts of engineering time invested get it right (but almost certainly with a performance hit). But I'd be unsurprised to find that one doesn't handle the http parsing spec correctly somewhere.

The issue is that such a spec is entirely unneeded and overly complicated. No benefit, only downsides. And for something as simple as basic parsing rules! Not even getting into anything that should be difficult.

Yeah they're both spec-compliant, but (IIRC, as of a few months ago), in gmail at least, there's a huge difference:

janedoe+acme@gmail.com resolves to janedoe@gmail.com, and the "+acme" is simply a useful bit of metadata for Jane to track the provenance of the sender's mailing list. The "+" and anything following it are ignored. It signifies an optional suffix.

Whereas janedoe.acme@gmail.com resolves to janedoeacme@gmail.com -- a completely different address than janedoe@gmail.com. The "." is simply ignored as if it weren't there, making j.anedoeacme and jane.doeacme and janedoe.acme equivalent.

Yeah this is specific to how gmail chooses to handle these spec-compliant-though-often-mishandled characters, but anyone who works with email professionally absolutely has to come to terms with how gmail works.
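
A minimal sketch of those two rules, for anyone who needs to tell whether two gmail addresses land in the same inbox. `gmail_canonical` is a hypothetical helper; lower-casing the local part is my own assumption on top of the dot and plus rules described above, and addresses on other domains are returned unchanged because their servers may treat these characters differently.

```python
def gmail_canonical(address: str) -> str:
    """Collapse a gmail.com address to the mailbox it resolves to under the rules above."""
    local, sep, domain = address.rpartition("@")
    domain = domain.lower()
    if not sep or domain != "gmail.com":
        return address  # other providers: no assumptions made
    # Drop the optional "+suffix", then ignore dots in what remains.
    local = local.partition("+")[0].replace(".", "").lower()
    return f"{local}@{domain}"
```

So janedoe+acme@gmail.com and jane.doe@gmail.com both collapse to janedoe@gmail.com, while janedoe.acme@gmail.com collapses to janedoeacme@gmail.com.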