by kitchenkarma 2522 days ago
This is very weak reasoning, because you cannot identify an individual by IP either. This project looks like trying to exploit loopholes. The idea behind GDPR is to make sure companies log only data they need. This project looks into logging the data but without expressing why this is even necessary. Therefore I don't think this is compliant with GDPR.

> because you cannot identify an individual by IP either

Yes you can, particularly if you correlate across different websites.

You are conflating identification of a person by behaviour analysis with matching an ID. What the ID actually is makes no difference here; it may as well be a hash. That just proves my point that this project is not compliant.
I remember when I learned that IP was considered personal information, I was shocked. But I thought about it and it does make sense.
GDPR is for protection of personal data and we store no personal data. Please take a read of this: https://usefathom.com/data/
I don't believe you understand what personal data and GDPR are. You are capturing user behaviour, and that is very personal regardless of whether it is "anonymised" or not, and you are doing it without a clear need. That is pretty much against GDPR.
You come across as somewhat hostile but I'm going to assume good intent on your part, so thank you for the challenges on our stance.

So if you take a look at Recital 26 (https://gdpr-info.eu/recitals/no-26/):

> To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly.

> To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments.

> The principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable.

> This Regulation does not therefore concern the processing of such anonymous information, including for statistical or research purposes.

So the piece about the principles of data protection not applying to personal data rendered anonymous is crucial. We believe that GDPR does not apply to us because of that. But even if GDPR did apply to us (we'll assume it does, that's always the best way to be), then our legal basis is that there's legitimate interest. As a website owner, it is in your legitimate business interest to understand how your website is performing - e.g. the most popular pages, the pages where people linger for longer, the pages where people bounce.

Article 4 (1) states:

> ‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;

A hash falls under this. You cannot just quote Recital 26 and stop reading because you found a passage that fits. Recital 30 covers the case of "other identifiers" that might replace cookies. No hard feelings, we all do that.

The data might be anonymous for a third party, but if you can single out just one person, in other words one unique visitor, it is not anonymous. NB: one IP poisons the whole data set.

So your fallback is Article 6(1)(f), which is reasonable, but you cannot assume the interest of a site owner always outweighs the interest of the visitors. You have to put your arguments into writing and give people the means to appeal. 6(1)(f) is not meant as a blank cheque or a batch job...

> Online identifiers for profiling and identification

> Natural persons may be associated with online identifiers provided by their devices, applications, tools and protocols, such as internet protocol addresses, cookie identifiers or other identifiers such as radio frequency identification tags.

> This may leave traces which, in particular when combined with unique identifiers and other information received by the servers, may be used to create profiles of the natural persons and identify them.

The thing is, you can't create profiles. So right now I could give you a single entry for our website

> NULL "" "https://www.usefathom.com" "bb9377f4cf33093765835a48e962a5dbd3168499abd12b120c8c118c86c41479"

How could we possibly use that to profile / identify? The hash (bb9377f4cf33093765835a48e962a5dbd3168499abd12b120c8c118c86c41479) is unique in the database table and never repeats.
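
The digest above is 64 hex characters, which is consistent with SHA-256, though that is only an inference from its length. Whether a value like this can link records depends on what goes into it. Here is a minimal sketch under assumed inputs, not Fathom's actual scheme: the function names, the salt, and the choice of IP, user agent and page URL as inputs are all hypothetical.

```python
import hashlib


def visitor_hash(salt: bytes, ip: str, user_agent: str) -> str:
    """Hypothetical scheme A: digest of salt + IP + user agent only.

    The same visitor produces the same value on every page view,
    so rows that share it can be grouped together.
    """
    material = b"|".join([salt, ip.encode(), user_agent.encode()])
    return hashlib.sha256(material).hexdigest()


def page_view_hash(salt: bytes, ip: str, user_agent: str, url: str) -> str:
    """Hypothetical scheme B: the page URL is mixed into the digest.

    The same visitor produces a different value on each page,
    so rows for different pages cannot be joined on this column.
    """
    material = b"|".join([salt, ip.encode(), user_agent.encode(), url.encode()])
    return hashlib.sha256(material).hexdigest()


if __name__ == "__main__":
    salt = b"example-salt"          # assumed; a real salt would be secret
    ip = "203.0.113.7"              # documentation-range address
    ua = "ExampleBrowser/1.0"       # made-up user agent

    a_home = visitor_hash(salt, ip, ua)
    a_blog = visitor_hash(salt, ip, ua)
    print("scheme A repeats across pages:", a_home == a_blog)

    b_home = page_view_hash(salt, ip, ua, "https://example.com/")
    b_blog = page_view_hash(salt, ip, ua, "https://example.com/blog")
    print("scheme B repeats across pages:", b_home == b_blog)
```

Under scheme A every row from one visitor carries the same value, which is the "singling out" case raised earlier in the thread. Under scheme B the value repeats only when the same visitor loads the same page again, which is closer to the "never repeats" behaviour described here, although it would still repeat on a reload unless something else changes.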

I hear you. We don't rely on Recital 26 to comply with GDPR. I've not had the Recital 26 piece confirmed by a lawyer; it's a personal hunch / exploration. Your comments on Recital 30 are helpful, thank you. I would like to hear your thoughts on my reply if possible :)

If you don't need the identifier, why not leave it out altogether (Art. 5(1)(c))? Or is it unique only in this case?
They don't have any PII and are therefore not subject to GDPR. They have data that, if it were not anonymized, would be PII, but it's anonymized and therefore isn't.
It doesn't matter whether it is an IP or another identifier, e.g. a hash. A person can be identified by behaviour, and behaviour is not anonymised.
How can a person be identified from a hash by behaviour? We built the software but you seem to know something we don't...