| HN Mirror



	by XorNot 719 days ago
	Except like, not really? If you need to remove someone's data for GDPR reasons, then "if match == userId then delete" is pretty straightforward in your log aggregation store.
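A minimal sketch of that check, assuming the store is mutable and each entry is a dict with a `user_id` field (both the entry shape and the `delete_user_entries` helper are hypothetical, not taken from any real log product):

```python
from typing import Iterable


def delete_user_entries(entries: Iterable[dict], user_id: str) -> list[dict]:
    """Return only the log entries that do NOT belong to user_id."""
    return [e for e in entries if e.get("user_id") != user_id]


# Hypothetical log entries for illustration.
logs = [
    {"user_id": "u1", "msg": "login"},
    {"user_id": "u2", "msg": "login"},
    {"user_id": "u1", "msg": "logout"},
]

remaining = delete_user_entries(logs, "u1")
```

This only catches entries where the ID sits in a known field; free-text messages that mention the user would need a separate pass.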


jcrites 719 days ago

Many log aggregation stores are not optimized for performing row-level updates or deletes like this. In my experience, the majority of log aggregation stores are immutable and support primarily time-based retention only.

(Though perhaps one can meet compliance needs by keeping these logs only for a fixed maximum period of time, e.g. 30 days, and keeping only appropriately anonymized data longer.)
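A rough sketch of time-based retention, assuming entries carry a timezone-aware `ts` timestamp and that the window is the 30 days used as an example above (the `prune_expired` helper and entry shape are made up for illustration; real stores usually drop whole time-partitioned segments rather than filtering rows):

```python
from datetime import datetime, timedelta, timezone

# Assumed retention window, taken from the "30 days" example.
RETENTION = timedelta(days=30)


def prune_expired(entries: list[dict], now: datetime) -> list[dict]:
    """Keep only entries whose timestamp falls inside the retention window."""
    cutoff = now - RETENTION
    return [e for e in entries if e["ts"] >= cutoff]


now = datetime(2024, 3, 1, tzinfo=timezone.utc)
entries = [
    {"ts": now - timedelta(days=45), "msg": "old"},
    {"ts": now - timedelta(days=5), "msg": "recent"},
]

kept = prune_expired(entries, now)
```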


wongarsu 719 days ago

Saying "we need to keep these logs for 30 days to allow us to troubleshoot problems. We can't reasonably delete them sooner, but they get deleted after 30 days" is a valid way to comply. You have a justifiable reason to keep them, the interval is reasonably short, and you have good technical reasons not to do it faster.

If your internal compliance people don't like it, you can also rephrase it as "we are removing the data starting right now, the procedure takes 30 days". You have one month to even respond to removal requests, and can stretch that by another two. As long as you are not intentionally causing delays, these are perfectly reasonable time frames.

Of course you still have to do all the other stuff for GDPR compliance, like making sure you have rules about who gets access to the log system instead of just giving it to the entire company, making sure you store the logs on an encrypted drive, etc.


hipadev23 719 days ago

A log aggregation store that can’t handle deletes in 2024 is a product that shouldn’t be utilized. GDPR and similar redaction laws are not new.

Efficient or fast is not a requirement for GDPR, so it can happen slowly and in the background just fine.


randerson 719 days ago

A log aggregation store that can handle deletes is a security and compliance problem. Try proving to an auditor that a hacker couldn't have hacked in and then covered their tracks by deleting the logs.


hipadev23 719 days ago

That’s an incredibly weak response. Laws you can’t fuck with, auditors can fuck off. I’d love to see you try to explain to the EU why you’re violating their laws because some auditor wanted to check a box. I sure hope your auditors are assuming legal responsibility.


randerson 719 days ago

Don't log anything you're not allowed to log. But in some industries (like finance) you need an immutable logging system and if you could easily delete evidence of a crime or security breach that would be a bug not a feature.
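To illustrate why a silent delete is a problem in such systems, here is a sketch of one common tamper-evidence approach, a hash chain, in which each entry's hash covers the previous entry's hash. This is an assumed technique for illustration, not something described in the comment above, and the `append` and `verify` helpers are hypothetical:

```python
import hashlib


def chain_hash(prev_hash: str, message: str) -> str:
    """Hash the message together with the previous entry's hash."""
    return hashlib.sha256((prev_hash + message).encode("utf-8")).hexdigest()


def append(log: list, message: str) -> None:
    """Append an entry whose hash depends on everything before it."""
    prev = log[-1]["hash"] if log else ""
    log.append({"msg": message, "hash": chain_hash(prev, message)})


def verify(log: list) -> bool:
    """Recompute the chain and report whether every link still matches."""
    prev = ""
    for entry in log:
        if entry["hash"] != chain_hash(prev, entry["msg"]):
            return False
        prev = entry["hash"]
    return True


log = []
for m in ["login alice", "transfer 100", "logout alice"]:
    append(log, m)

intact = verify(log)
tampered = log[:1] + log[2:]  # simulate deleting the middle entry
```

Removing an entry from the middle breaks every later link, so `verify(tampered)` fails. Truncating the tail is not detected by this sketch; that would need the latest hash to be recorded somewhere outside the log.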


thanksgiving 719 days ago

I don’t understand this… what if we had no logs?


randerson 719 days ago

I should have mentioned this is really only an issue if your business has regulatory requirements that necessitate tamper-proof logging.


laerus 719 days ago

What if you anonymize the actual user entity with that user ID instead? Even if you have that user ID in your logs, the name or any sensitive field would be something like "GDPR says HI".


danpalmer 719 days ago

This is necessary but not sufficient. Logs can contain other data that could be used to narrow down the user base enough that you could guess which user it is, and now from just the logs you have de-anonymised an ID and can see everything that user did, or likely did.

In reality you need multiple different steps here: anonymous IDs, well-defined reasonable retention periods, strong access control and audit logging, and a privacy policy that says why the data is collected (for service quality typically) and how/when it will be deleted.

There's no one clever trick to GDPR; the law was intentionally designed to require businesses to apply holistic best practice. Whether it has done that well or not is another matter, but that was at least the aim.


llamaLord 719 days ago

Exactly! The logs only need to hold IDs that you can correlate back to hydrated data later.

GDPR request comes in, just delete the record the ID refers to and you're done.
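A minimal sketch of that idea, assuming an in-memory `users` table keyed by ID and log entries that carry only the ID (the `hydrate` and `handle_erasure_request` names are hypothetical):

```python
# Hypothetical user table and log; only the log-side ID links the two.
users = {"42": {"name": "Alice", "email": "alice@example.com"}}
logs = [{"user_id": "42", "event": "checkout"}]


def hydrate(entry: dict, users: dict) -> dict:
    """Join a log entry with its user record, if one still exists."""
    user = users.get(entry["user_id"])
    return {**entry, "name": user["name"] if user else None}


def handle_erasure_request(user_id: str, users: dict) -> None:
    """Delete the user record; the log entries themselves are untouched."""
    users.pop(user_id, None)


before = hydrate(logs[0], users)
handle_erasure_request("42", users)
after = hydrate(logs[0], users)
```

After the request, the entry no longer resolves to a name, but the ID and everything else in the log line are still there.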


unscaled 719 days ago

This is not enough.

First, as another reply above has mentioned, other data in the logs (such as IP address, list of friends, browser fingerprint) can be used to de-anonymize the pseudonymous ID.

Second, GDPR makes it quite clear (for the reasons above) that pseudonymized data is still considered personal data. Pseudonymization reduces the risks, but does not remove them entirely. It should generally be combined with other measures such as encryption.
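A small sketch of why a pseudonym alone does not anonymize a log, assuming IDs are replaced with a keyed HMAC-SHA256 digest (the key, field names, and `pseudonymize` helper are made up for illustration):

```python
import hashlib
import hmac

# Hypothetical key for illustration only; a real key would live in a secrets store.
SECRET_KEY = b"example-key-not-for-production"


def pseudonymize(user_id: str, key: bytes = SECRET_KEY) -> str:
    """Replace a user ID with a keyed, deterministic pseudonym."""
    return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


# The pseudonym hides the raw ID, but it is stable across entries and
# sits next to other identifying fields such as the IP address.
entry_a = {"uid": pseudonymize("alice"), "ip": "203.0.113.7", "event": "login"}
entry_b = {"uid": pseudonymize("alice"), "ip": "203.0.113.7", "event": "purchase"}
```

Both entries carry the same `uid` and the same IP, so anyone who can tie that IP to a person can read that person's whole history out of the log.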
