by snowwrestler 1860 days ago
How are you going to satisfy data compliance, which may require the deletion of PII upon request or expiration, if your PII data is immutable?

Was just considering this when I came across your comment.

I'm hoping someone here can suggest a one-way audit-log (audit-trail) sort of solution, because I need this for the medical industry.

I would say the structure of the records themselves can stay, but not the data itself.

If you have a user table, maybe you can just hash the sensitive user data with a random salt but keep the record.
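A minimal sketch of that idea, assuming a SQLite database with hypothetical `users` and `orders` tables (the table names, column names and the `scrub_user` helper are made up for illustration, not taken from any real schema). The PII columns are overwritten with salted digests and the salt is thrown away, so the row and everything that points at it stay in place:

```python
import hashlib
import secrets
import sqlite3


def scrub_user(conn: sqlite3.Connection, user_id: int) -> None:
    """Overwrite a user's PII with one-way digests; the row and its id survive."""
    row = conn.execute(
        "SELECT name, email FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if row is None:
        return
    # Fresh random salt that is never stored, so the digest cannot be
    # recomputed later from a guessed name or email.
    salt = secrets.token_bytes(16)
    scrubbed = [
        None if value is None
        else hashlib.sha256(salt + value.encode("utf-8")).hexdigest()
        for value in row
    ]
    conn.execute(
        "UPDATE users SET name = ?, email = ? WHERE id = ?",
        (scrubbed[0], scrubbed[1], user_id),
    )
    conn.commit()


# Hypothetical schema and sample data, in memory only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, "
    "user_id INTEGER NOT NULL REFERENCES users(id), total_cents INTEGER)"
)
conn.execute("INSERT INTO users VALUES (1, 'Ada Example', 'ada@example.com')")
conn.execute("INSERT INTO orders VALUES (10, 1, 4200)")

scrub_user(conn, 1)
```

After the call, order 10 still references user 1, but the `name` and `email` columns only hold unrecoverable hex strings. Whether that counts as deletion for a given regulation is a legal question the sketch does not answer.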

Not 100% sure about this though, since you could probably re-identify the user with statistics. For example, if it's known that one person gets a specific disease every 10 years and you have an obfuscated record of a person connected with that disease, it's fairly straightforward to work out who that person is from that connection alone.
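One rough way to spot that kind of risk is to count how many scrubbed records share the same combination of remaining attributes (the idea behind k-anonymity). The sketch below uses made-up records and a hypothetical `risky_groups` helper; any combination that occurs fewer than `k` times points at very few people and is a candidate for re-identification:

```python
from collections import Counter

# Made-up, already "obfuscated" records: no names, only quasi-identifiers.
records = [
    {"region": "north", "diagnosis": "flu"},
    {"region": "north", "diagnosis": "flu"},
    {"region": "north", "diagnosis": "flu"},
    {"region": "south", "diagnosis": "rare_condition"},
]


def risky_groups(rows, keys, k=2):
    """Return attribute combinations shared by fewer than k rows."""
    counts = Counter(tuple(row[key] for key in keys) for row in rows)
    return {group: n for group, n in counts.items() if n < k}


unique = risky_groups(records, ("region", "diagnosis"))
print(unique)  # {('south', 'rare_condition'): 1}
```

Here the single `rare_condition` record stands alone, so hashing the name does little to hide who it belongs to.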

The way to do it is to have foreign keys, but instead of hard delete you scrub data in the columns.

Seems like if you can scrub data, it is mutable? Maybe I misunderstand what “immutable” means.

How are you going to respond to a warranty claim if you've wrongly deleted the order data when a subject requests that you delete all PII you have about them?

https://www.adyen.com/blog/gdpr-what-it-means-for-customer-p...

You store the sales contract separately, which is what this whole subthread is about.

https://news.ycombinator.com/item?id=27251168

I'm not sure how this changes anything. Is your PII in the order forms/sales contracts mentioned above? If yes, you'll have to delete those as well anyway, right? If it's not, the order forms/sales contracts themselves don't have to be linked to something that may potentially get deleted.

Please note that by "immutable" I don't mean that data won't get deleted eventually, just that it won't be deleted until nothing needs it anymore (and until then won't be mutated either), so basically the same thing that languages like Haskell mean by "immutable". Then, once you don't need it (= it's not observable anymore), it could perhaps get archived or erased, whatever you prefer.
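To make "not deleted until nothing needs it anymore" concrete, here is a small sketch, assuming hypothetical `customers` and `orders` tables in SQLite and a made-up `erase_if_unreferenced` helper. A customer row is never updated, and it is only erased once no order refers to it:

```python
import sqlite3


def erase_if_unreferenced(conn: sqlite3.Connection, customer_id: int) -> bool:
    """Delete the customer row only if no order still refers to it."""
    (refs,) = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE customer_id = ?", (customer_id,)
    ).fetchone()
    if refs:
        return False  # still observable through an order, so leave it alone
    conn.execute("DELETE FROM customers WHERE id = ?", (customer_id,))
    conn.commit()
    return True


# Hypothetical schema and sample data, in memory only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customers(id))"
)
conn.execute("INSERT INTO customers VALUES (1, 'Still Referenced')")
conn.execute("INSERT INTO customers VALUES (2, 'No Orders Left')")
conn.execute("INSERT INTO orders VALUES (10, 1)")

kept = erase_if_unreferenced(conn, 1)    # False: order 10 still needs it
erased = erase_if_unreferenced(conn, 2)  # True: nothing refers to it
```

This is only an illustration of the definition; doing this kind of reachability check across a real schema is a much bigger job.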

Then it isn’t master data anymore; it’s just one field of a record of a commercial document. This is taking a long way around to the same point.

(Presumably no-one is trying to reference-count GC their RDBMS. If so, I wish them all the luck in the world.)

I'm not quite sure what "master data" means in English (a non-native language to me) but Wikipedia tells me that it's "data about the business entities that provide context for business transactions" (and lists examples that sound relevant for this situation to me). Based on that I'm inclined to think that this would qualify.