by stingraycharles 1588 days ago
How is encryption compliant? I’ve implemented GDPR data infrastructures twice now, and as far as I’m aware, the only way to be compliant with encryption is when you throw the decryption key away.

Sometimes it might be a single field in a 1MB nested structure that you have to remove. So that field gets encrypted when the whole structure is stored, and when the field is to be deleted you just throw away the key instead of rewriting the entire 1MB just to remove a few kB.
If you're comparing gov't regulations to delete data to saving a few KB, then I think you're looking at this wrong.
It's a few KB per record. In practice, when schemes like that are applied, it means "in total we can remove this key and not rewrite 10M rows across 3 data stores, which itself would cost $$$ and make the database and incremental backups cry".
Bingo.
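A minimal sketch of the scheme described in this subthread, assuming one key per user held in a small mutable key store while the encrypted records sit in bulk storage. `KeyStore`, `encrypt_field` and `decrypt_field` are hypothetical names, and the HMAC-based keystream is only a standard-library stand-in so the example runs without dependencies. A real system would use an audited authenticated cipher such as AES-GCM.

```python
import hashlib
import hmac
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with an HMAC-SHA256 counter keystream.

    Illustration only. It has no integrity protection; use a vetted AEAD
    cipher in anything real.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


class KeyStore:
    """Hypothetical per-user key store: the only place a delete happens."""

    def __init__(self):
        self._keys = {}

    def key_for(self, user_id: str) -> bytes:
        # Create the key on first use, reuse it afterwards.
        return self._keys.setdefault(user_id, secrets.token_bytes(32))

    def get(self, user_id: str):
        return self._keys.get(user_id)

    def shred(self, user_id: str) -> None:
        # "Deleting" the user's data means dropping this one small entry.
        self._keys.pop(user_id, None)


def encrypt_field(keys: KeyStore, user_id: str, plaintext: str) -> bytes:
    nonce = secrets.token_bytes(16)
    return nonce + _keystream_xor(keys.key_for(user_id), nonce, plaintext.encode())


def decrypt_field(keys: KeyStore, user_id: str, blob: bytes):
    key = keys.get(user_id)
    if key is None:
        return None  # key was shredded: the stored bytes are unreadable
    return _keystream_xor(key, blob[:16], blob[16:]).decode()
```

The 1MB structure would carry only the output of `encrypt_field` for the sensitive field; calling `keys.shred(user_id)` leaves every stored copy and backup untouched but undecryptable, provided the key store itself is not retained in backups.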

We did a similar thing, except we replaced the values with a UUID and stored the pair in a lookup table somewhere. Delete that row and none of the rest of the data can be tied back to a human being.

Bonus, most people didn't need that data, and it was no longer given out to everyone who grabbed the entire dataset.
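A rough sketch of that lookup-table approach, assuming an in-memory dict stands in for the real table. `PseudonymTable` and its methods are made-up names for illustration, not anything from the commenter's system.

```python
import uuid


class PseudonymTable:
    """Hypothetical lookup table mapping random UUID tokens to real values."""

    def __init__(self):
        self._by_token = {}

    def tokenize(self, value: str) -> str:
        # The bulk dataset stores only this random token.
        token = str(uuid.uuid4())
        self._by_token[token] = value
        return token

    def resolve(self, token: str):
        # Only callers with access to the table can recover the value.
        return self._by_token.get(token)

    def forget(self, token: str) -> None:
        # Deleting one row orphans every record that carries the token.
        self._by_token.pop(token, None)
```

Rows in the main dataset keep the token in place of the email or name, so anyone who copies the dataset without the table gets no direct identifier. Whether the remaining fields are truly non-identifying still depends on what else is in each row.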

As mentioned, encrypt something and throw away the key, often called "crypto shredding".
Ahh I see, and that way you can quickly “remove” a whole lot of data by just removing the key, which makes for cheap operations and/or a more flexible workflow (you can periodically compact the database and remove entries for which you have no key).

Is my understanding correct?

Yes, but it's also that a lot of data these days ends up in pseudo-append-only stores (like S3/Glacier, or many data warehouse products) where deletes/updates to old data are extremely expensive. Or you'd have to scan petabytes of cold-stored data looking for a particular user's records. Throwing away the key is instant and "free".
Interesting... this raises soooo many questions.

How are "crypto-shredding" actions propagated to the access patterns/layer?

I assume that there is an encrypted partition/cluster/shard key (in addition to similarly encrypted rows/fields) that is invalidated during the shredding causing any predicate matching on these ids to evaluate to false.

---

Now that I've typed this out, I realize that by electing to encrypt individual fields, any and all predicate matching will evaluate to false, and it has nothing to do with partitioning, sharding, or clustering...

I guess it would also be pretty awesome since you could invalidate entire sets of data by "shredding" grouping ids that are being used as partition/cluster/shard keys.

Now I realize that this implies that you shouldn't encrypt each and every field of related data the same way (grouping ids), otherwise you're potentially going to end up with unique keys/ids for common attributes across sets of data... potentially rendering clustering/sharding/partitioning useless (cardinality too great).

While "defragging" or "rebalancing" this increasingly "sparse", old data would be expensive, surely there has to come a point where the storage costs start to exceed that of interaction costs for specific subsets of your prefixes. For instance, partitions that consist entirely of data that has had all of its respective encryption keys shredded.
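To make that last point concrete, here is a small sketch of how one might spot partitions worth dropping or compacting. It assumes each record carries a plaintext partition label and the id of the key that encrypts its sensitive fields; the record layout and the function names are invented for illustration.

```python
from collections import defaultdict


def partition_liveness(records, live_key_ids):
    """Return {partition: (live_count, total_count)}.

    A record counts as live if its encryption key still exists.
    """
    stats = defaultdict(lambda: [0, 0])
    for rec in records:
        entry = stats[rec["partition"]]
        entry[1] += 1
        if rec["key_id"] in live_key_ids:
            entry[0] += 1
    return {part: tuple(counts) for part, counts in stats.items()}


def partitions_to_drop(records, live_key_ids):
    """Partitions in which every record's key has been shredded."""
    liveness = partition_liveness(records, live_key_ids)
    return sorted(part for part, (live, _total) in liveness.items() if live == 0)


def partitions_to_compact(records, live_key_ids, max_live_ratio=0.5):
    """Partitions that are mostly dead but still hold some readable records."""
    liveness = partition_liveness(records, live_key_ids)
    return sorted(
        part
        for part, (live, total) in liveness.items()
        if 0 < live and live / total <= max_live_ratio
    )
```

The `max_live_ratio` threshold is an arbitrary placeholder; where the break-even sits between storage cost and rewrite cost depends on the store's pricing and is not something the thread specifies.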

---

Illuminating comment that has set my mind into overdrive... Fascinating stuff!

That doesn't sound like something jeff_vader was talking about, since "deleting any old data was rejected" and this is definitely a way of deleting stuff.