by KingOfCoders 1401 days ago
I personally wish companies would encrypt email addresses in their database. This would at least help against SQL injection attacks and some others (e.g. attacker has only DB system access and not app server access), so it's more difficult for attackers to aggregate data on me. To me it feels very casual to wave away the leak of email addresses and just give the usual "passwords were encrypted". But YMMV.

The difference between email and password is you can validate a password with a hash, but you can’t send an email to a hashed address. Their db may be encrypted at rest, but a hacker could still compromise a system that has the key in memory.
Encrypt the email in one column and add a hashed email in a separate column. Email sending would then be handled by a separate and "airgapped" system that holds the decryption key. If you need to send mail, you send that system the encrypted email address plus what you want to send.

Now an attacker cannot get a hold of email addresses easily.

This is a great idea. You could use public key cryptography too, so that the system adding emails to the db doesn't need the private key.

3rd party mail sending services could support this by generating a keypair on their systems, and only giving you the public half. When you make an API request to send an email, you provide only the encrypted version of the address.

Edit: The hashing is an issue. It's too easy to build a wordlist of possible addresses, to crack the hash. I think this can only work if you drop the hash column, and instead require users to log in using a username.
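
To make the wordlist concern concrete, here is a minimal sketch of the attack, assuming the hash column is a plain unsalted SHA-256 of the lowercased address. The function names, the leaked set and the wordlists are all made up for illustration.

```python
from __future__ import annotations

import hashlib
from itertools import product


def email_hash(address: str) -> str:
    """Unsalted SHA-256 of a normalised address (the weak scheme being attacked)."""
    return hashlib.sha256(address.strip().lower().encode("utf-8")).hexdigest()


def crack(leaked_hashes: set[str], local_parts: list[str], domains: list[str]) -> dict[str, str]:
    """Hash every local@domain candidate and report the ones found in the leak."""
    recovered = {}
    for local, domain in product(local_parts, domains):
        candidate = f"{local}@{domain}"
        digest = email_hash(candidate)
        if digest in leaked_hashes:
            recovered[digest] = candidate
    return recovered


# Hypothetical leaked hash column and attacker wordlists.
leaked = {email_hash("alice@example.com"), email_hash("bob@example.org")}
found = crack(leaked, ["alice", "bob", "carol"], ["example.com", "example.org"])
print(sorted(found.values()))  # ['alice@example.com', 'bob@example.org']
```

Real wordlists come from earlier breaches and common name patterns, so the candidate space is far smaller than it is for good passwords.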

The hashing is an issue but you need to identify the user somehow when you do things like password resets.

The alternative is to handle everything by a username and password resets also use the username (which would be fine, worst case you get spammed by PW reset mails).

Though of course you can also combat this by making the hash particularly expensive and salting it. Simply apply SHA3-512 to the email address a few thousand times, take the first 12 bits, and use them to select one of 4096 buckets of records. Matching the full email is then simply an application of Blake2sp, which you calculate in parallel for all records in that bucket.
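
A rough sketch of the bucketed lookup described above, under a few assumptions: Python's hashlib has no BLAKE2sp, so plain BLAKE2s stands in for it; the round count is arbitrary; and `UserIndex` is a hypothetical in-memory stand-in for the database table. The per-bucket loop is sequential here, though it could be parallelised.

```python
import hashlib
import hmac
import os

BUCKET_BITS = 12        # 2**12 = 4096 possible buckets
STRETCH_ROUNDS = 5000   # "a few thousand times"; tune for your hardware


def normalise(address: str) -> bytes:
    return address.strip().lower().encode("utf-8")


def bucket_id(address: str) -> int:
    """Iterate SHA3-512 and keep only the first BUCKET_BITS bits of the result."""
    digest = normalise(address)
    for _ in range(STRETCH_ROUNDS):
        digest = hashlib.sha3_512(digest).digest()
    return int.from_bytes(digest, "big") >> (512 - BUCKET_BITS)


def record_hash(address: str, salt: bytes) -> bytes:
    """Per-record salted hash; BLAKE2s is used here in place of BLAKE2sp."""
    return hashlib.blake2s(normalise(address), salt=salt).digest()


class UserIndex:
    """Hypothetical table: bucket id -> list of (salt, salted hash, user id)."""

    def __init__(self) -> None:
        self.buckets = {}

    def add(self, address: str, user_id: int) -> None:
        salt = os.urandom(8)  # BLAKE2s accepts salts of up to 8 bytes
        entry = (salt, record_hash(address, salt), user_id)
        self.buckets.setdefault(bucket_id(address), []).append(entry)

    def find(self, address: str):
        # Only the records in one bucket need their salted hash recomputed.
        for salt, stored, user_id in self.buckets.get(bucket_id(address), []):
            if hmac.compare_digest(stored, record_hash(address, salt)):
                return user_id
        return None
```

An attacker with a dump still has to pay the stretched hash once per guess and then one salted hash per record in the bucket, which is where the extra cost comes from.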

Adjust the 12-bit prefix so that each bucket represents a decent-sized chunk of users: smaller buckets mean less load on the login service, larger buckets mean better anonymity. Instead of SHA3-512 you could also use a bloom filter to find out whether a set of records contains the email or not, with the added bonus of being probabilistic.
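
For the bloom filter variant, a tiny sketch might look like the following. The sizes and the trick of deriving the hash functions from salted BLAKE2b are my own choices, not something specified above. One filter would be kept per set of records.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits: int = 8192, num_hashes: int = 4) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, address: str):
        data = address.strip().lower().encode("utf-8")
        for i in range(self.num_hashes):
            # Derive several hash functions by varying the BLAKE2b salt.
            salt = i.to_bytes(2, "big")
            digest = hashlib.blake2b(data, digest_size=8, salt=salt).digest()
            yield int.from_bytes(digest, "big") % self.size_bits

    def add(self, address: str) -> None:
        for pos in self._positions(address):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, address: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(address))
```

A `might_contain` hit only says the set probably holds the address, so the per-record check still has to confirm it.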

You could also ditch Blake2sp for a simple round of salted SHA3-512. The fact that you salted it makes dictionary search insanely annoying already.

That's a simple and brilliant idea. I'm running with it.
How would you tell the airgapped system what to send?
I used quotation marks on purpose. It is of course networked, but it would use different credentials from the other systems and have an ingest-only API endpoint for issuing mails.
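
As a sketch of what "ingest-only" could mean in practice: the mail system exposes a single submit call and never hands the plaintext address back. `MailRequest`, `MailGateway` and the injected `decrypt` and `deliver` callables are all hypothetical names; the real decryption and SMTP code would be plugged in behind them.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class MailRequest:
    encrypted_address: bytes   # ciphertext as stored in the application database
    subject: str
    body: str


class MailGateway:
    """Hypothetical ingest-only service: accepts requests, never returns addresses."""

    def __init__(
        self,
        decrypt: Callable[[bytes], str],
        deliver: Callable[[str, str, str], None],
    ) -> None:
        self._decrypt = decrypt   # the decryption key lives only behind this callable
        self._deliver = deliver   # e.g. a wrapper around an SMTP client

    def submit(self, request: MailRequest) -> bool:
        """Decrypt and send; the caller learns only whether the request was accepted."""
        try:
            address = self._decrypt(request.encrypted_address)
        except Exception:
            return False
        self._deliver(address, request.subject, request.body)
        return True
```

The web app only ever holds ciphertext, so compromising it alone does not yield the address list. It would still let an attacker send mail through the gateway, which is why the endpoint needs its own credentials and rate limits.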
> a hacker could still compromise a system that has the key in memory.

Security is about layers. Simply because a hacker “could” do something, does not mean it’s a bad idea. Getting the encryption key when it’s not stored in the database requires the hacker to now have access not to just the database but to another system as well.

This is an excellent point, but there's nuance to it.

This seems like an acceptable solution for email and a lot of other PII. However, if you were to propose the same thing for passwords, with the same argument, I'd be dead against it -- even beyond the total lack of need for the system to ever have the actual password. I'm not quite sure how to explain this, though.

There’s no reason a company needs to know your password. But they do need to know a way to contact you.
Invariably some developer would just store the key in a column next to the email address so they could process any transaction directly in the query.

But the hackers would have to know what algorithm was used :) That's a layer, right?

> some developer would just store the key in a column next to the email address

I think that depends on where you work. Process. Code reviews before allowing merge/pull requests can help.

In the healthcare industry in the USA, Personally Identifiable Information (PII)/Protected Health Information (PHI) needs to be encrypted at rest and in transit, and this is mandated by law. So, they are required to encrypt PII/PHI data fields.

Some of those practices may be generally applied for non-healthcare settings as well.

To get nitpicky... (usual disclaimer, IANAL but I worked in health IT including heavy involvement in HIPAA topics earlier in my career) I don't think there's a requirement under HIPAA or HITECH to use encryption.

The relevant parts of HIPAA are the duty to not disclose PHI to unauthorized recipients and breach notification requirements if you do incorrectly disclose PHI (the HIPAA breach notification rule).

The magic of encryption is that HIPAA provides safe harbor if the data stolen/lost/intercepted was encrypted to certain standards. So if you lose an encrypted hard drive full of PHI, or someone breaks into your servers and steals encrypted data but not the decryption capability, then it's not considered a breach under HIPAA and you do not need to notify anyone.

Tons of PHI isn't stored encrypted at rest. Physical theft of the hard drive from the practice's back-end EHR database server hasn't generally been high priority on the HIPAA breach potential risk assessment list. But nearly all data in transit, on employee laptops, etc. will be encrypted, because that's where you want the safety net of the safe harbor provision.

You are right. The law mandates reasonable safeguards, and one of them is encryption at rest/in motion when deemed necessary by the covered entity (which is quite common in healthcare).

From the HHS site: https://www.hhs.gov/hipaa/for-professionals/faq/2001/is-the-...

> Is the use of encryption mandatory in the Security Rule?

> Answer:

> No. The final Security Rule made the use of encryption an addressable implementation specification. See 45 CFR § 164.312(a)(2)(iv) and (e)(2)(ii). The encryption implementation specification is addressable, and must therefore be implemented if, after a risk assessment, the entity has determined that the specification is a reasonable and appropriate safeguard in its risk management of the confidentiality, integrity and availability of e-PHI. If the entity decides that the addressable implementation specification is not reasonable and appropriate, it must document that determination and implement an equivalent alternative measure, presuming that the alternative is reasonable and appropriate. If the standard can otherwise be met, the covered entity may choose to not implement the implementation specification or any equivalent alternative measure and document the rationale for this decision.

Does at-rest mean: encrypted on storage so no one can physically steal a drive, or encrypted in the database so no one can get the information with SQL without the key (e.g. Postgres column encryption)?
Conceptually, yes. You can encrypt at the database/filesystem level (where the OS/DB engine manages the encryption keys and enforces access control), at a table level/column level (where the db engine enforces access control) or at the application level (where the application manages the encryption keys and they are separate from the database engine).

They serve different purposes. For example, when a disk drive is faulty and thrown away, you may not want data to be recoverable from it, so filesystem-level encryption helps there. DB/table/column-level encryption helps when different applications (e.g. transaction processing and analytics) access a shared database: reporting queries may not need access to the sensitive fields, whereas certain transaction processes may need it. When you want separation of concerns, you can add application-level encryption (on top of the other two). Example: your data is stored in the cloud and you don't want the cloud service provider to know the data, or you don't want your data to leak if they replace a disk drive as part of normal servicing.

This depends on the threat model.

Apple did a good job with this, I think it’s called Apple Hide My Email.

https://support.apple.com/guide/icloud/what-you-can-do-with-...

It seems to me that encrypting emails is either untenable or insufficient depending on how you do it. You could do a one-way operation like is used on passwords, but then you can't access the user's email address to send them emails. You could instead do a two-way encryption but that likely means using a hardcoded key to decrypt, and that key can't be considered secure if attackers have access to the system. There may be other more effective options but I'm no security expert and I haven't given much thought to other solutions.
You're mostly right.

Hardcoding a key would be a bad idea. You would need some way to rotate keys. Maybe also encrypt the actual data encryption keys under another key encrypting key.

But this only defends against attacks which can't get that key (e.g. a SQL injection attack that just dumps table contents).

Having said that, you only need to decrypt if you want to send an email, for logging in you could just store a one way salted hash.
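
A minimal sketch of that login lookup, with one swap I should name plainly: instead of a per-record salt it uses HMAC-SHA256 with a single server-side secret, so the hash can still be looked up directly with one query. `LOOKUP_KEY`, the `users` dict and the function names are placeholders.

```python
import hashlib
import hmac

# Assumption: a server-side secret kept outside the database (for example in the
# app server's configuration). The value below is only a placeholder.
LOOKUP_KEY = b"replace-with-a-random-secret"


def login_lookup_hash(address: str, key: bytes = LOOKUP_KEY) -> str:
    """Keyed one-way hash of the normalised address, used as the lookup column."""
    normalised = address.strip().lower().encode("utf-8")
    return hmac.new(key, normalised, hashlib.sha256).hexdigest()


# Hypothetical user table keyed by the lookup hash instead of the plaintext address.
users = {login_lookup_hash("alice@example.com"): {"user_id": 1}}


def find_user(address: str):
    return users.get(login_lookup_hash(address))
```

Someone who only dumps the table cannot run a wordlist against these hashes without also obtaining the key, which is the same threat model as the rest of this comment.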

More importantly, this is a lot of effort to protect data that isn't usually regarded as that sensitive (unlike the passwords). If I had the security budget to do that, I'd almost certainly spend it on something else.

If that 2-way encryption key is stored separately to the database (e.g. only the web server has it, not the database server), it certainly helps reduce the risk that the emails are compromised.