by eric_h 2995 days ago
I was under the impression that literally anything goes on the left side of the @. It’s all up to the individual mail server

Even what's on the right side of the @ is up to the email server. You can send an email to @gmail.com or @GMAIL.COM, and Google could route the two differently.

It'll resolve the same in DNS, but what the user typed is carried in the email headers, so you could route differently depending on whether it's upper-case, mixed-case, or whatever.

Not true. The domain part is case-insensitive by the standard.

A server can choose non-standard behavior, but that would be foolish.
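
To make that concrete, here's a minimal sketch of the conservative canonicalization this implies: lowercase only the domain and leave the local part exactly as typed. The function name `normalize_domain` is made up for illustration, and this is not a full address parser.

```python
def normalize_domain(address: str) -> str:
    """Lowercase only the domain; keep the local part as the user typed it."""
    # Split on the LAST "@", since a quoted local part may itself contain "@".
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an address: {address!r}")
    return f"{local}@{domain.lower()}"


print(normalize_domain("John.Doe@GMAIL.COM"))
```

Whether two local parts that differ only in case reach the same mailbox is still up to the receiving server, so the sketch doesn't touch them.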

The name part, per the standard, is case sensitive. However, some organizations I get email from canonicalize it to all caps, and one even removes the dots(!) as well. That matters for my Gmail account, where I have a filter that sends all non-dotted email to spam, since 90% of my spam is non-dotted. That particular email was important and I only happened to notice it in spam, so I checked the capitalization change, called them, and even managed to get through to their email manager. He absolutely insisted I was wrong about it mattering and said they wouldn't change it.
Changing case could be a company's lazy way of checking for existing accounts. In a perfect world they would store the input as-is and also use a lower or upper function on the indexed column. I suppose if they are verifying emails with their users and not seeing a big drop-off in verified accounts, they probably don't care to be too exact. Removing dots from the user portion is pretty shitty though.
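
A rough sketch of the "store as-is, index on lower()" idea, using SQLite from the Python standard library. The table, index, and function names are invented for illustration, and SQLite's built-in `lower()` only folds ASCII by default, so treat this as an outline and not a recommendation for any particular database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
# Uniqueness is enforced on the lowercased value; the stored value is untouched.
conn.execute("CREATE UNIQUE INDEX users_email_lower ON users (lower(email))")


def register(email: str) -> bool:
    """Insert the address as typed; return False if a case-variant already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        return False


first = register("John.Doe@example.com")
second = register("john.doe@EXAMPLE.com")
stored = [row[0] for row in conn.execute("SELECT email FROM users")]
```

The duplicate check is case-insensitive, but outgoing mail can still use exactly what the user typed.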
The domain being in any case for the purpose of delivery is part of the standard, but there's nothing I know of that would prohibit you from implementing local delivery and routing in any way you want once the mail is accepted.
You have no guarantee that the domain part case is preserved from what the user wrote, if it's not defined to be case-sensitive. So you can do what you want, but the input data are not reliable.
IIRC, not quite literally, but pretty close. The email RFCs in general tend to be much less constrained than one might expect.
Yup. Even spaces are allowed if you put double quotes around the local part: "Kevin Spacey"@example.com. It’s really surprising, there’s very little that can be verified if you strictly follow the RFC.
And comments:

    Muhammad.(I am  the greatest) Ali @(the)Vegas.WBA
(an example lifted directly from RFC 822, apparently written by a madman in love with 70s parser theory).
> there’s very little that can be verified if you strictly follow the RFC.

That's not correct. You can reliably, completely validate all email addresses . . . by sending an email to them containing information that can be used to confirm the owner's identity. All together now: the MTA is the only source of "is it valid or not" for email addresses.
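
One way the confirmation step could look, as a sketch only: derive a token from the address with a server-side secret, mail it to the address, and accept the address once the token comes back. The names `SECRET`, `make_token`, and `confirm` are hypothetical, and actually sending the mail is left out.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; a real service would load this from config.
SECRET = secrets.token_bytes(32)


def make_token(address: str) -> str:
    """Token to embed in the confirmation email sent to `address`."""
    return hmac.new(SECRET, address.encode("utf-8"), hashlib.sha256).hexdigest()


def confirm(address: str, token: str) -> bool:
    """True only if `token` is the one that was issued for `address`."""
    return hmac.compare_digest(make_token(address), token)
```

No syntax checking is involved: if the token comes back, the MTA delivered the mail and somebody read it.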

This is a common issue with text-based formats and IETF RFCs. HTTP allows comments in some headers, and it allows line breaks:

  H: hi
    Mom
Is the same as

  H: hi Mom
Bets on how many HTTP clients and servers get this right? Without losing speed?
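
For reference, a minimal sketch of the unfolding rule from the example above: a line break followed by spaces or tabs continues the previous header line. The helper name `unfold` is made up, and this only covers the folding rule, not full header parsing. Later HTTP specs (RFC 7230) deprecated this folding.

```python
import re

# A line break followed by at least one space or tab marks a continuation line.
_FOLD = re.compile(r"\r?\n[ \t]+")


def unfold(raw_headers: str) -> str:
    """Join folded continuation lines onto the previous header line."""
    return _FOLD.sub(" ", raw_headers)


print(unfold("H: hi\r\n    Mom"))
```

Even this small rule means a parser can't treat a line as complete without peeking at the first byte of the next one.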

My guess is that when you're not responsible for ensuring compatibility or having to write robust, fast code, the temptation to be cute with your format takes over.

I expect all commonly used HTTP clients (Chrome, Firefox, IE, even libcurl) and servers (Apache, Nginx, does IIS still exist?) get this right.
I, too, would expect that the most common pieces of software with massive amounts of engineering time invested get it right (but almost certainly with a performance hit). But I'd be unsurprised to find that one doesn't handle the HTTP parsing spec correctly somewhere.

The issue is that such a spec is entirely unneeded and overly complicated. No benefit, only downsides. And for something as simple as basic parsing rules! Not even getting into anything that should be difficult.