by Mashimo 3031 days ago
Can't test the product they try to promote because emails with a `+` in them are not valid.

Email validation regexes are so annoying. Everyone ought to just use .+@.+ as their validation regex and not be more strict than that.

Beyond that, just queue the email and try to deliver it. Tell the user that an email should arrive shortly, and that if it doesn’t they should check their spam folder and check that they gave the correct email address. When you say this, repeat the email address that the user gave you (escaped for XSS, of course).

I think some people “validate” against a strict pattern to keep their users from mistyping, but there are so many ways to make a typo and still match those regexes that IMO it’s pointless to use a complicated regex. 80% of the time those regexes end up rejecting actually valid (though unusual) email addresses.

I think for a lot of developers the reason they do this is that they’ve learned that they should validate data and so they decide to validate email and to do so they either copy-paste some random-ass regex off the internet or they write their own broken regexes.

All your regex should do is to ensure that there is an @ in the address and that there is something before and something after. This keeps people from mistakenly entering say for example their phone number because they didn’t read what the field was for.
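For illustration, here is a minimal Python sketch of that check, using the `.+@.+` pattern mentioned above. The function name `looks_like_email` is made up for this example, and this is only one way to write it.

```python
import re

# The deliberately permissive pattern from the comment:
# something, then an @, then something.
LOOSE_EMAIL = re.compile(r".+@.+")


def looks_like_email(value: str) -> bool:
    """Return True if value has at least one character on each side of an @."""
    return LOOSE_EMAIL.fullmatch(value) is not None
```

With this, `looks_like_email("user+tag@example.com")` is accepted, while a phone number such as `"555-1234"` is rejected because it contains no @ at all.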

Preventing people from making your machine send emails where it should not, such as to root@localhost on your server or elsewhere on your local network (I don’t know why anyone would, and it wouldn’t be a big issue, just a tiny bit annoying), is a server configuration concern. Specifically, it is a matter of configuring your email server software and your firewalls.

User presses sign up -> Send them to the registration form, they fill in their details which you validate lightly client-side, they submit -> You validate lightly server-side and either send them back to the form or on to the next step -> You tell them “Thank you, your registration is now complete. An email should arrive in your inbox shortly. If it does not, please check your spam folder and also check that you entered your email address correctly. The email address you gave us was somebody@example.com.”
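A rough sketch of the last step, assuming the confirmation is rendered as HTML and using Python's standard `html.escape` for the XSS escaping. The function name `confirmation_html` is hypothetical; a real app would more likely rely on its template engine's auto-escaping.

```python
import html


def confirmation_html(email: str) -> str:
    """Build the post-signup message, echoing the address back safely."""
    # html.escape replaces &, <, > and quote characters with HTML entities,
    # so a hostile "address" cannot inject markup into the page.
    safe_email = html.escape(email)
    return (
        "<p>Thank you, your registration is now complete. "
        "An email should arrive in your inbox shortly. "
        "If it does not, please check your spam folder and also check that "
        "you entered your email address correctly. "
        f"The email address you gave us was {safe_email}.</p>"
    )
```

Echoing the address back is what lets the user spot their own typo, which no regex can do for them.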

I could see validating the domain after the @, but yes, delving into the username portion of an email address... that way madness lies.
Sorry, but .+@.+ isn't going to cut it if you want to confidently accept deliverable email addresses. Regex valid, but not email valid:

codetrotter@example

code@trotter@example.com

code trotter@example.com

codetrotter@example..com

codetrotter@example.com.

.codetrotter@example.com

My company runs a website that has elderly people signing up for newsletters. The client is paranoid about not getting every last drop of possible data, and raises hell for every email address that isn't deliverable. There are LOTS of ways to easily ruin a simple email address.

There is another email that the regexp lets pass and that’s still not valid:

codetrotter@example.com

It conforms to the expected format and could be a valid email, but it’s actually not because no such user exists.

An email might also exist, but not accept mail from you.

The given email address might exist, but could belong to another user.

There’s a million things that can go wrong and you’ll have a very hard time catching them.

The only way to identify if an email is valid and accepts mail is to actually send an email there.

You can, as a help for the user, identify odd-looking email addresses and flag them in the UI (“this looks unusual, are you sure?”), but generally speaking, chances are high that any strict validation will reject real-world addresses while not catching all errors.

Good points. Why do regex tests at all if emails could fail in any number of ways?

If you're going to test for at least 3 characters with a @ in the middle, you probably should implement some other simple rules to have a snowball's chance on the internet:

only one @

no spaces

has TLD (guarantee at least one period after @, and something else after, no consecutive periods)

can't begin or end with a period
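As a sketch only, the four rules above could be written like this in Python. The function name `rule_violations` and the rule labels are invented for the example, and the reading of "has TLD" (a period inside the domain, no consecutive periods) is an assumption about what was meant.

```python
def rule_violations(address: str) -> list[str]:
    """Return the names of the rules listed above that the address breaks."""
    problems = []

    if address.count("@") != 1:
        problems.append("only one @")

    if any(ch.isspace() for ch in address):
        problems.append("no spaces")

    # Everything after the last @ is treated as the domain.
    domain = address.rpartition("@")[2]
    # Require a period strictly inside the domain and no consecutive periods.
    if "." not in domain.strip(".") or ".." in domain:
        problems.append("has TLD")

    if address.startswith(".") or address.endswith("."):
        problems.append("no leading or trailing period")

    return problems
```

Note that this flags `example@de` under "has TLD", so it rejects anything on a bare top-level domain, and it still cannot tell whether a mailbox exists.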

Your “has TLD” test is already wrong: localpart@tld (example@de) is an odd, but valid address.
A (very simple) regex test might exclude some randomly entered garbage or an inadvertently invalid address. Even then, the scope and reliability of such tests is minuscule.
It appears you're missing the main point of the parent post.

You cannot validate an email address.

You can make a basic excruciatingly simple test for proper form. And should probably limit your checks to that.

For all else, attempt to use the email address provided for validation within your onboarding loop with a sufficiently unique verification URL or code. If that succeeds, the address is ... still not absolutely certainly valid, as it may have gone to a third party who proceeded to verify it. But at least it delivered to somebody.
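A minimal sketch of the verification step, assuming Python's standard `secrets` module for the token. The `VERIFY_URL` endpoint, the in-memory `pending` dict, and both function names are placeholders for illustration; a real system would persist tokens and expire them.

```python
import secrets
from typing import Optional
from urllib.parse import urlencode

# Hypothetical endpoint; substitute your own verification route.
VERIFY_URL = "https://example.com/verify"


def new_verification_link(pending: dict[str, str], email: str) -> str:
    """Create an unguessable token, remember its address, return the link to mail."""
    token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe encoding
    pending[token] = email
    return f"{VERIFY_URL}?{urlencode({'token': token})}"


def confirm(pending: dict[str, str], token: str) -> Optional[str]:
    """Return the address for a token and consume it, or None if unknown."""
    return pending.pop(token, None)
```

A successful `confirm` only shows that someone with access to the mailbox clicked the link, which is exactly the caveat above.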

See: https://hackernoon.com/the-100-correct-way-to-validate-email...

An email address being syntactically valid is no guarantee that the mailbox even exists or is correctly mapped to the person at the keyboard!
If you want to "confidently accept deliverable email addresses", the only solution is to send a test message out with a link that the recipient has to visit to validate that the email address exists. Otherwise, what about addresses that are both regex-valid and email-valid but still don't exist?
codetrotter@example is a valid email
I should have been more specific: an email address that is routable over the internet. Where's the TLD on that?
What section of what standard says a TLD can't have an email server on it? Is AAA not allowed to host an email server on `aaa`[1], and have the email `sales@aaa`?

[1]: And "aaa" is a valid TLD; see the full list: https://www.iana.org/domains/root/db ; now, perhaps it is required to at least have a second level domain, but that's what I'm asking: is an MX record invalid on a TLD?

It’s actually a valid email and I’ve seen examples of such emails, but can’t remember the exact specifics.
> emails with a `+` in them are not valid

Wrong: https://en.wikipedia.org/wiki/Email_address#Syntax

(I know this is of little help if your email app doesn't allow them. I just want to point out that the standard does allow them.)

I think what Mashimo means is that Infoteam.ch thinks that `+` is invalid when it isn't. Sadly they're far from the only ones whose email validation code won't accept the plus character, or many other legal characters in the local part of an email address.