Except like, not really? If you need to remove someone's data for GDPR reasons, then "if match == userId then delete" is pretty straightforward in your log aggregation store.
Many log aggregation stores are not optimized for performing row-level updates or deletes like this. In my experience, the majority of log aggregation stores are immutable and support only time-based retention.
(Though perhaps one can meet compliance needs by keeping these logs only for a fixed maximum period of time, e.g. 30 days, and keeping only appropriately anonymized data longer.)
Saying "we need to keep these logs for 30 days to allow us to troubleshoot problems. We can't reasonably delete them sooner, but they get deleted after 30 days" is a valid way to comply. You have a justifiable reason to keep them, the interval is reasonably short, and you have good technical reasons not to do it faster.
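To make the time-based retention idea concrete, here is a minimal sketch of a 30-day cutoff. The entry shape (a dict with a timezone-aware `timestamp` field) and the `prune_expired` helper are assumptions for illustration, not any particular store's API. A real log store would more likely drop whole time partitions than filter individual entries.

```python
from datetime import datetime, timedelta, timezone

# Assumed policy window, taken from the 30-day example in the thread.
RETENTION = timedelta(days=30)

def prune_expired(entries, now=None):
    """Return only the entries that are still inside the retention window.

    `entries` is a hypothetical list of dicts, each with a timezone-aware
    `timestamp`. Nothing is looked up per user: age alone decides.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - RETENTION
    return [e for e in entries if e["timestamp"] >= cutoff]

# Fixed "now" so the example is reproducible.
now = datetime(2024, 3, 31, tzinfo=timezone.utc)
logs = [
    {"timestamp": now - timedelta(days=45), "msg": "old login"},
    {"timestamp": now - timedelta(days=5), "msg": "recent login"},
]
kept = prune_expired(logs, now=now)
print([e["msg"] for e in kept])
```

The point of the design is that no row-level delete is ever needed: every entry, for every user, ages out on the same schedule.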
If your internal compliance people don't like it you can also rephrase it as "we are removing the data starting right now, the procedure takes 30 days". You have one month to even respond to removal requests, and can stretch that by another two. As long as you are not intentionally causing delays these are perfectly reasonable time frames.
Of course you still have to do all the other stuff for GDPR compliance, like making sure you have rules about who gets access to the log system instead of just giving it to the entire company, making sure the logs are stored on an encrypted drive, etc.
A log aggregation store that can handle deletes is a security and compliance problem. Try proving to an auditor that a hacker couldn't have hacked in and then covered their tracks by deleting the logs.
That’s an incredibly weak response. Laws you can’t fuck with, auditors can fuck off. I’d love to see you try to explain to the EU why you’re violating their laws because some auditor wanted to check a box. I sure hope your auditors are assuming legal responsibility.
Don't log anything you're not allowed to log. But in some industries (like finance) you need an immutable logging system, and if you could easily delete evidence of a crime or security breach, that would be a bug, not a feature.
What if you anonymize the actual user entity with that user ID instead? Even if you have that user ID in your logs, the name or any sensitive field would be something like "GDPR says HI".
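A rough sketch of that idea, assuming a hypothetical in-memory `users` table and a made-up list of sensitive field names. The log line is left untouched and still carries the user ID; only the record it points to is scrubbed.

```python
# Hypothetical list of fields treated as personal data in this sketch.
SENSITIVE_FIELDS = ("name", "email")

def anonymize_user(users, user_id, placeholder="[erased]"):
    """Overwrite the sensitive fields of one user record, keeping its ID."""
    record = users[user_id]
    for field in SENSITIVE_FIELDS:
        if field in record:
            record[field] = placeholder
    return record

users = {42: {"name": "Alice", "email": "alice@example.com", "plan": "pro"}}
log_line = {"user_id": 42, "action": "login"}  # immutable log entry, not edited

anonymize_user(users, 42)
print(users[log_line["user_id"]])
```

After this runs, resolving the ID from the log only yields placeholder values, but everything else the log recorded about that ID is still there.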
This is necessary but not sufficient. Logs can contain other data that could be used to narrow down the user base enough that you could guess which user it is, and now from just the logs you have de-anonymised an ID and can see everything that user did, or likely did.
In reality you need multiple different steps here: anonymous IDs, well-defined reasonable retention periods, strong access control and audit logging, and a privacy policy that says why the data is collected (for service quality typically) and how/when it will be deleted.
There's no one clever trick to GDPR; the law was intentionally designed to require businesses to apply holistic best practice. Whether it has done that well or not is another matter, but that was at least the aim.
First, as another reply above has mentioned, other data in the logs (such as IP address, list of friends, browser fingerprint) can be used to de-anonymize the pseudonymous ID.
Second, GDPR makes it quite clear (for the reasons above) that pseudonymized data is still considered personal data. Pseudonymization reduces the risks, but does not remove them entirely. It should generally be combined with other measures such as encryption.
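For illustration, one common way to produce a pseudonymous ID is a keyed hash. This is a minimal sketch using HMAC-SHA256 from the Python standard library; the `pseudonymize` helper, the key value, and the ID format are all assumptions. As the comment above says, the output is still personal data: anyone holding the key, or enough surrounding log data, can link it back to a person.

```python
import hashlib
import hmac

def pseudonymize(user_id: str, key: bytes) -> str:
    """Map a user ID to a stable token using a keyed hash (HMAC-SHA256)."""
    return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()

# Assumed example key; in practice it would be stored apart from the logs.
key = b"example-key-kept-outside-the-log-store"

token = pseudonymize("user-42", key)
print(token)

# Same input and key give the same token, so a user's log lines stay linkable.
print(token == pseudonymize("user-42", key))
```

Because the mapping is deterministic, all of one user's activity remains correlated under a single token, which is exactly why it has to be combined with retention limits and access control.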