Hacker News
by whs 736 days ago
My company needs deterministic encryption to search encrypted data.

Turns out the people who wrote the in-house Go library didn't have any idea how to do that securely. There is no non-deterministic encryption function because that might be too complicated for non-senior engineers (after all, they wrote most of the actual application) to choose correctly.

The first version uses AES-CFB. There's no authentication. It was probably copy-pasted from a public Gist, and nobody ever commented that it is insecure. I wonder if it was actually intended to be the non-deterministic version, but the higher-level wrappers do not wrap this function, so people didn't actually use it.
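To make the "no authentication" point concrete, here is a minimal sketch of why bare AES-CFB is malleable. The helper names (cfbCrypt, tamper), the all-zero key and IV, and the "role=user" message are made up for the demo; they are not taken from the library described above. An attacker who knows or guesses part of the plaintext can rewrite it without the key, and decryption raises no error.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"fmt"
)

// cfbCrypt runs bare AES-CFB with a caller-supplied IV and no MAC.
// Hypothetical helper for this demo only.
func cfbCrypt(key, iv, in []byte, decrypt bool) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	var stream cipher.Stream
	if decrypt {
		stream = cipher.NewCFBDecrypter(block, iv)
	} else {
		stream = cipher.NewCFBEncrypter(block, iv)
	}
	out := make([]byte, len(in))
	stream.XORKeyStream(out, in)
	return out, nil
}

// tamper rewrites ciphertext bytes at offset so that plaintext `known`
// decrypts as `want` instead. It needs no key. In CFB the block after the
// modified one would come out garbled, so this demo keeps the message
// inside a single 16-byte block.
func tamper(ct []byte, offset int, known, want []byte) []byte {
	out := append([]byte(nil), ct...)
	for i := range known {
		out[offset+i] ^= known[i] ^ want[i]
	}
	return out
}

func main() {
	key := make([]byte, 32)           // all-zero demo key, never do this for real
	iv := make([]byte, aes.BlockSize) // fixed demo IV

	ct, err := cfbCrypt(key, iv, []byte("role=user"), false)
	if err != nil {
		panic(err)
	}
	forged := tamper(ct, 5, []byte("user"), []byte("root"))

	pt, err := cfbCrypt(key, iv, forged, true)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", pt) // prints "role=root"
}
```

An AEAD mode such as GCM would reject the forged ciphertext at decryption time instead of returning attacker-chosen plaintext.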

The second version uses AES-GCM with a nonce derived from the key and the AD (associated data). Since nobody understood why the AD is needed, the AD is always nil. Essentially, there's only ever one nonce per key.
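A rough sketch of why a single nonce per key is fatal for GCM. The derivation below (HMAC-SHA256 of the AD under the key, truncated to the nonce size) is only an assumption standing in for whatever the library really does; the function names and sample emails are likewise invented. Because GCM encrypts with a CTR keystream, two ciphertexts under the same key and nonce XOR to the XOR of their plaintexts, so one known plaintext reveals another of the same length.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

// deriveNonce is a hypothetical stand-in for the library's scheme: the nonce
// depends only on the key and the AD, never on the message or on randomness.
func deriveNonce(key, ad []byte, size int) []byte {
	mac := hmac.New(sha256.New, key)
	mac.Write(ad)
	return mac.Sum(nil)[:size]
}

// sealDeterministic encrypts with AES-GCM using the derived nonce.
// The output is ciphertext followed by the 16-byte tag.
func sealDeterministic(key, plaintext, ad []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := deriveNonce(key, ad, gcm.NonceSize())
	return gcm.Seal(nil, nonce, plaintext, ad), nil
}

// xorBytes returns a XOR b, truncated to the shorter input.
func xorBytes(a, b []byte) []byte {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	out := make([]byte, n)
	for i := range out {
		out[i] = a[i] ^ b[i]
	}
	return out
}

// recoverWithKnownPlaintext recovers the plaintext behind c2 from c1 and the
// known plaintext of c1, with no key. Assumes both plaintexts have the same
// length and were sealed under the same key and nonce.
func recoverWithKnownPlaintext(c1, known, c2 []byte) []byte {
	return xorBytes(xorBytes(c1, c2), known)
}

func main() {
	key := make([]byte, 32) // all-zero demo key
	p1 := []byte("alice@example.com")
	p2 := []byte("carol@example.org")

	c1, _ := sealDeterministic(key, p1, nil) // AD is always nil, as in the post
	c2, _ := sealDeterministic(key, p2, nil)

	fmt.Printf("%s\n", recoverWithKnownPlaintext(c1, p1, c2)) // prints "carol@example.org"
}
```

Nonce reuse in GCM also exposes the authentication key, so forgery becomes possible too; the sketch only shows the confidentiality loss.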

I think the problem is that many senior engineers know that encryption uses an "AES" library, but the Go standard library doesn't tell you how to use it securely.

Surprisingly, this mistake also happened in our Java stack, which was written by a different team. A senior engineer did notice and quietly moved away from the vulnerable version without telling the Go team.

I wrote a POC to decrypt data from the Go version, then wrote the third version; perhaps it will be open source soon. The new library only implements envelope key management, an encrypted string wrapper, and ORM integration. The rest is Google's Tink.

2 comments

> My company needs deterministic encryption to search encrypted data.

I'll take things you should never do as a non-expert for $100.

> The first version uses AES-CFB. There's no authentication. It was probably copy-pasted from a public Gist, and nobody ever commented that it is insecure. I wonder if it was actually intended to be the non-deterministic version, but the higher-level wrappers do not wrap this function, so people didn't actually use it.

Lack of authentication is probably the least of your concerns if your product is searching over encrypted data.

You use the AD to authenticate additional information that doesn't need to be encrypted. For example, if you separately encrypted every record of a database, you could leave a non-sensitive identifier exposed along with each of them and validate it as the AD when decrypting. This would allow you to find specific records quickly assuming you also had an (encrypted) index or some prior knowledge. As with any case of leaving some data exposed, this can open up certain avenues of attack depending on the threat model. If the data can be tampered with, for example, this isn't a good idea since an attacker can corrupt your database (you'll know, but it will be unusable).
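As a sketch of the record-identifier idea, assuming AES-GCM with a random nonce per record: the ID stays in the clear but is passed as the AD, so a ciphertext moved under a different ID fails to open. The names sealRecord and openRecord and the "user:42" style IDs are hypothetical.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealRecord encrypts value and binds it to recordID through the AD.
// The output layout is nonce || ciphertext || tag; recordID is stored
// separately, unencrypted.
func sealRecord(key []byte, recordID string, value []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, value, []byte(recordID)), nil
}

// openRecord decrypts sealed and fails if recordID differs from the one
// used when sealing, or if the ciphertext was modified.
func openRecord(key []byte, recordID string, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("sealed record too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, []byte(recordID))
}

func main() {
	key := make([]byte, 32) // all-zero demo key
	sealed, err := sealRecord(key, "user:42", []byte("secret value"))
	if err != nil {
		panic(err)
	}

	pt, err := openRecord(key, "user:42", sealed)
	fmt.Println(string(pt), err) // decrypts, err is nil

	_, err = openRecord(key, "user:43", sealed)
	fmt.Println(err != nil) // true: the record was moved to another ID
}
```

This is the non-deterministic case, so it does not help with searching; it only shows what the AD is for.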

[Edit: I was unaware of the existence of "deterministic AEAD" before I wrote this: "Deterministic" encryption is discouraged because it lets block-aligned patterns in the plaintext show through in the ciphertext. There is a simple method to do what you're after: just feed your data (with padding) directly into the cipher (so-called ECB mode). Go's standard library gives you the raw AES cipher to do this with, but it doesn't expose the standard padding mechanisms (and the result is not authenticated). You should be aware that doing anything like this leaves your data open to certain kinds of cryptanalysis that can infer the plaintext without directly breaking the cipher.]
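A small sketch of the pattern leak, for illustration and not as a recommendation. It assumes PKCS#7 padding written by hand (the helper names pkcs7Pad and ecbEncrypt are made up) and uses the raw block cipher from crypto/aes. Two identical 16-byte plaintext blocks come out as two identical ciphertext blocks.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"fmt"
)

// pkcs7Pad appends n bytes of value n so the length becomes a multiple of
// blockSize. Hand-rolled here because the standard library has no helper.
func pkcs7Pad(data []byte, blockSize int) []byte {
	n := blockSize - len(data)%blockSize
	out := append([]byte(nil), data...)
	return append(out, bytes.Repeat([]byte{byte(n)}, n)...)
}

// ecbEncrypt encrypts each padded block independently with the raw cipher.
// Unauthenticated and pattern-leaking; shown only to demonstrate the leak.
func ecbEncrypt(key, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	bs := block.BlockSize()
	padded := pkcs7Pad(plaintext, bs)
	out := make([]byte, len(padded))
	for i := 0; i < len(padded); i += bs {
		block.Encrypt(out[i:i+bs], padded[i:i+bs])
	}
	return out, nil
}

func main() {
	key := make([]byte, 32) // all-zero demo key
	// Two identical 16-byte blocks followed by a short tail.
	pt := []byte("AAAAAAAAAAAAAAAA" + "AAAAAAAAAAAAAAAA" + "tail")

	ct, err := ecbEncrypt(key, pt)
	if err != nil {
		panic(err)
	}
	fmt.Println(bytes.Equal(ct[0:16], ct[16:32]))  // true: the repetition is visible
	fmt.Println(bytes.Equal(ct[0:16], ct[32:48])) // false: the tail block differs
}
```

A deterministic AEAD such as AES-SIV still reveals when two whole messages are equal, but it does not leak per-block repetition like this, and it is authenticated.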

I largely agree that the standard library doesn't provide any solid guidance or higher-level APIs for any use case other than TLS. The implementations seem to be pretty high-quality but you quickly go from "it's hard to use this wrong" in some libraries to "here's a drawer full of sharp knives" in others.