Hacker News
by morgante 2989 days ago
The scope of personal data is disastrously large and the guidance is fuzzy at best.

Take, for example, my old blog. It has commenting enabled and a standard Apache config (where logs include IP addresses). If I want to comply with GDPR, I have to do a bunch of work around log rotation/encryption and provide tools for old commenters to go back and remove their information. And this is the simple case, where I'm not using any 3rd-party analytics.

No part of my "business model" is attempting to profit from personal data yet I have to jump through a bunch of new hoops.

My likely solution for projects is to simply block EU traffic going forward.

4 comments

IP addresses aren't PII. If you're capturing IP + real name, or similar (email + real name) then AIUI you'll need to tell people on request who you sell that info to and allow removal.

Assuming it's a personal blog then just don't capture any PII. Don't sell it, be prepared to delete a user's comments on request. Don't capture PII without informed consent.

Easy, no?

> IP addresses aren't PII.

I personally agree, but everything I've read about GDPR says they are now usually considered in scope.

Deleting comments is non-trivial. How do I verify that the person requesting deletion is the original commenter? How do I then wipe out every mention of their IP address from all my logs?

These are easily solvable questions for large companies, but they are overhead for small startups and personal projects.
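For what it's worth, both questions above can be roughed out in a few lines. This is only a sketch under assumptions that aren't stated in the thread: that the blog stored an email address with each comment (so a signed token can be mailed to it to confirm the requester), and that the logs are in Apache's common/combined format with the client address as the first field. `SECRET`, `deletion_token`, `token_is_valid` and `purge_ip` are made-up names, not part of any blog software.

```python
import hashlib
import hmac

# Hypothetical secret; a real setup would load it from config, not hard-code it.
SECRET = b"change-me"


def deletion_token(email, comment_id, secret=SECRET):
    """Token to mail to the address that was stored with the comment."""
    msg = f"{email.strip().lower()}:{comment_id}".encode()
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()


def token_is_valid(token, email, comment_id, secret=SECRET):
    """Constant-time check that the token matches this email/comment pair."""
    expected = deletion_token(email, comment_id, secret)
    return hmac.compare_digest(token, expected)


def purge_ip(lines, ip):
    """Drop log lines whose first field (the client address) equals `ip`."""
    return [line for line in lines if line.split(" ", 1)[0] != ip]
```

Whoever can read the mailbox on record can complete the deletion, which is about as strong a check as a small blog has available. It does nothing for commenters who left no email or a fake one.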

> be prepared to delete a user's comments on request.

Or, just block users from the EU from commenting. I can see the win for the Internet here.

IP by itself is not considered private. It only becomes so when you attach it to other identifying data. Anonymous comments are not covered with GDPR.

> Anonymous comments

Wordpress asks for your name and e-mail to post a comment, doesn't it?

I guess the tuple (ip,name,email,comment_text) is PII?

Name is, email is, IP combined with either (or both) is.

However, couldn't the ISP's logs of dynamic IP assignments (in theory) be matched against the IP addresses of anonymous comments, thus de-anonymising them?

No, because you need to take into account the effort needed to de-anonymise the IP address.

> > (26) The principles of data protection should apply to any information concerning an identified or identifiable natural person. Personal data which have undergone pseudonymisation, which could be attributed to a natural person by the use of additional information should be considered to be information on an identifiable natural person. To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly. To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments. The principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable. This Regulation does not therefore concern the processing of such anonymous information, including for statistical or research purposes.

This article makes a compelling argument that it could be: http://privacylawblog.fieldfisher.com/2016/can-a-dynamic-ip-...

IANAL, but I'd be wary of saying that you'll be fine storing dynamic IP addresses. You'll probably need to have a rationale as to why you don't consider them personal data.

> Anonymous comments are not covered with GDPR.

There is no guarantee that comments stay anonymous. Commenters can, and do, enter their real name as their display name.

For Apache can't you just change LogFormat to exclude IPs and delete the old logs?
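Dropping `%h` from the `LogFormat` string would take care of new entries. If deleting the old logs outright loses too much, a middle ground might be to blank out the host part of each address in place. A rough sketch, assuming common/combined format (client address is the first field); `mask_ip` and `mask_log_line` are made-up names, and the /24 and /48 cut-offs are arbitrary choices, not anything the regulation specifies:

```python
import ipaddress


def mask_ip(addr):
    """Zero the host part: keep a /24 for IPv4, a /48 for IPv6."""
    ip = ipaddress.ip_address(addr)  # raises ValueError if not an IP
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)


def mask_log_line(line):
    """Mask the first field of a log line; leave other lines untouched."""
    first, sep, rest = line.partition(" ")
    try:
        return mask_ip(first) + sep + rest
    except ValueError:
        # First field is a hostname or something else, not an IP address.
        return line
```

Running every line of an old log through `mask_log_line` and writing the result back keeps rough traffic patterns for debugging while dropping the exact client address.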

Yet, you're still collecting it, and it doesn't seem like you're taking steps to protect it.

Because I fundamentally don't think a random foreign entity should dictate how I manage logs on my personal blog. It's challenging enough to debug issues without having IP issues.

I don't even consider a random IP to be PII.