by Cieplak 2757 days ago
Is it possible to comply with GDPR while using this to store data? Given that it operates like an append-only log, is it possible to actually remove data to comply with a GDPR request?
8 comments

You can use cryptoshredding: have an encryption key for each user (stored outside of this ledger) and encrypt all PII with that key. Throw away the key if the user wants you to delete their data.
But then you must also plan for what happens when that encryption is broken. So I think you also need to control and protect your storage in order to make that a safe strategy.
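
For what it's worth, here is a minimal sketch of the idea in Python. Everything in it is made up for illustration (`CryptoShredder`, `append`, `read`, `forget`), and the cipher is a toy HMAC-SHA256 keystream so the example runs on the standard library alone; a real system would presumably use a vetted authenticated cipher and a proper key management service.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream (HMAC-SHA256 in counter mode). ASSUMPTION: this is only a
    # stand-in for a real authenticated cipher and is not meant for production.
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256)
        out += block.digest()
        counter += 1
    return bytes(out[:length])


def _xor(data: bytes, stream: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, stream))


class CryptoShredder:
    """Hypothetical wrapper: append-only ledger plus a separate, mutable key store."""

    def __init__(self):
        self.keys = {}          # per-user keys, stored outside the ledger
        self.shredded = set()   # users whose key has been thrown away
        self.ledger = []        # append-only: entries are never changed or removed

    def append(self, user_id: str, pii: str) -> int:
        if user_id in self.shredded:
            raise ValueError("key for this user was shredded")
        key = self.keys.setdefault(user_id, secrets.token_bytes(32))
        nonce = secrets.token_bytes(16)  # fresh nonce for every entry
        data = pii.encode("utf-8")
        ciphertext = _xor(data, _keystream(key, nonce, len(data)))
        self.ledger.append((user_id, nonce, ciphertext))
        return len(self.ledger) - 1

    def read(self, index: int):
        user_id, nonce, ciphertext = self.ledger[index]
        key = self.keys.get(user_id)
        if key is None:
            return None  # key is gone, so the entry is unreadable
        plain = _xor(ciphertext, _keystream(key, nonce, len(ciphertext)))
        return plain.decode("utf-8")

    def forget(self, user_id: str) -> None:
        # The "shredding" step: drop the key, leave the ledger untouched.
        self.keys.pop(user_id, None)
        self.shredded.add(user_id)
```

The ledger keeps its ciphertext after `forget`, which is exactly the worry raised above: the scheme is only as good as the cipher and the guarantee that no copy of the key survives.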

The more I think about these things, the more I distrust cloud providers, and want my own hardware.

Do you really trust these companies enough to hand them the keys to all your data? Is there really any way to provide secrets to your app without trusting the hosting provider?

If you only care about legal liability, cryptoshredding is generally recognized as an effective measure for secure deletion.
Fuck players who operate like that. Slater Systems will always protect its users at all cost.
What if the key leaked before you have thrown it away?
That's a good question!

If your keys leaked, you'd probably have to assume you lost all of the data up to that point. To secure the data going forward, you'd need to generate a second key per user for all of the future data. Well, and hopefully shore up the security problems!

I agree, though, that an immutable ledger like this complicates things in a way that you-shouldn't-mutate-but-can datastores do not.

I think it's worse than just losing the data. If you operate a public cryptographic ledger holding users' data in the EU, and do it under some company name, you won't be able to comply with the "right to be forgotten", or whatever it's called.

I'm currently working on this problem as it applies to blockchains. The plan ATM is to implement cryptographic snapshots of the data, where the old transactions are erased but a proof of them remains available.

It's almost like regulations on remembering are a bad idea...
Probably the same as when the actual data is leaked.
Key rotation, disclosure, generation, storage, escrow, regulatory jurisdictions - there are a lot more issues than the ones you mention.
By not storing that type of data. If you need to store that type of data, you can also anonymize it, or turn it into keys whose values you keep in a separate (mutable) database.

Also, for some purposes (legal) you are allowed to store the data regardless because you have to for other reasons.

But what about successors of GDPR? What if it becomes illegal to harbor certain information that already exists in the ledger?
Seeing as it's immutable, it would appear not.

Which is why you use it for specialised use cases and keep any PII out of there.

It would be possible to replay into a new ledger, filtering out the pieces of data to be deleted, but that goes against having an immutable log in the first place.
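
A rough sketch of why the replay hurts, assuming a simple hash-chained ledger (the entry layout and the names `build_ledger` and `replay_without` are invented for the example, not taken from the product being discussed):

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry's predecessor


def build_ledger(records):
    """Chain records so that each entry's hash covers the previous entry's hash."""
    ledger = []
    prev = GENESIS
    for record in records:
        payload = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((prev + payload).encode("utf-8")).hexdigest()
        ledger.append({"record": record, "prev": prev, "hash": digest})
        prev = digest
    return ledger


def replay_without(ledger, should_drop):
    """Replay the surviving records into a brand-new ledger."""
    kept = [entry["record"] for entry in ledger if not should_drop(entry["record"])]
    return build_ledger(kept)
```

Every entry after the first dropped record gets a different hash in the new ledger, so any proof or reference issued against the old chain stops verifying. That is the sense in which replaying defeats the purpose of an immutable log.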

It's easy enough to store sensitive data externally (e.g., in a key value store) and simply store a reference to the data along with its hash in the ledger. When data needs to be removed, delete the data from your KV store and add an entry to the ledger noting that it was removed.
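
Something like this, as a sketch only. The class name, the dict standing in for the KV store, and the entry fields (`op`, `ref`, `sha256`) are all assumptions for illustration:

```python
import hashlib
import uuid


class LedgerWithExternalStore:
    """Hypothetical pairing of an append-only ledger with a mutable KV store."""

    def __init__(self):
        self.kv_store = {}  # mutable: the only place the sensitive bytes live
        self.ledger = []    # append-only: holds references and hashes, never the data

    def store(self, data: bytes) -> str:
        ref = str(uuid.uuid4())
        self.kv_store[ref] = data
        digest = hashlib.sha256(data).hexdigest()
        self.ledger.append({"op": "stored", "ref": ref, "sha256": digest})
        return ref

    def fetch(self, ref: str):
        data = self.kv_store.get(ref)
        if data is None:
            return None  # removed, or never stored
        expected = next(
            e["sha256"] for e in self.ledger
            if e["op"] == "stored" and e["ref"] == ref
        )
        # The ledger hash lets us detect tampering in the mutable store.
        if hashlib.sha256(data).hexdigest() != expected:
            raise ValueError("stored data does not match ledger hash")
        return data

    def remove(self, ref: str) -> None:
        self.kv_store.pop(ref, None)
        # Record the removal; the original "stored" entry stays in the ledger.
        self.ledger.append({"op": "removed", "ref": ref})
```

One caveat with this design: a plain hash of short, guessable data (an email address, say) can be brute-forced, so a salted hash with the salt kept in the KV store may be the safer variant.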

But you probably wouldn't store sensitive user data in this kind of database anyway. Not every use case is well-suited for a ledger like this. In most applications, this would be pointless overhead.

Your data modelling has to be GDPR compliant not your database.
How do you delete user data from an immutable store? You get into cryptography at that point and then some edge cases make it not so simple.
Every system that stores PII has two parts. The first is the actual PII store, which is very small, "mutable" in your terminology, locked down so that nobody can access it without raising an alarm, and rarely accessed at all outside a few limited use cases, GDPR requests among them. The rest of the system only holds references to the entities in that PII store, such as a numeric id (a foreign key, in SQL terms). This way any data store (SQL, data lake, etc.) can be GDPR compliant without deleting anything from the large data stores. It also improves security: if the large data stores are breached, the PII itself is not exposed.
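
A minimal sketch of that split, with invented names (`PiiVault`, `record_event`) and plain in-memory structures standing in for the real stores:

```python
import itertools


class PiiVault:
    """Small, mutable, locked-down store: the only place PII lives."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._people = {}
        self.access_log = []  # every lookup or erase is recorded ("raises an alarm")

    def register(self, name: str, email: str) -> int:
        person_id = next(self._ids)
        self._people[person_id] = {"name": name, "email": email}
        return person_id

    def lookup(self, person_id: int, reason: str):
        self.access_log.append((person_id, reason))
        return self._people.get(person_id)

    def erase(self, person_id: int) -> bool:
        # GDPR erasure touches only this small store.
        self.access_log.append((person_id, "erasure request"))
        return self._people.pop(person_id, None) is not None


def record_event(events: list, person_id: int, action: str) -> None:
    # The large store (here just a list) keeps the numeric id, never the PII.
    events.append({"person_id": person_id, "action": action})
```

After `erase`, the events are still there and still reference the id, but the id no longer resolves to a person.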
If you tie a user to a uuid separately from where you are logging the transactions, you can nullify the existing UUID link to the given user and be in full compliance with GDPR.
The point is that you never put it in an immutable store in the first place if there's a chance that it would need to be removed later.
GDPR's "right to be forgotten" has an exception clause that defers to regional accounting compliance laws, such as retaining a credit card transaction for 5 years.
You would probably want to keep the sensitive data in a separate store, and just reference it from the immutable one.
Ha! Somebody had the same doubt I had!