by microtonal 1217 days ago
From the Homebrew documentation:

A Homebrew analytics user ID, e.g. 1BAB65CC-FE7F-4D8C-AB45-B7DB5A6BA9CB. This is generated by uuidgen and stored in the repository-specific Git configuration variable homebrew.analyticsuuid within $(brew --repository)/.git/config. This does not allow us to track individual users, but does enable us to accurately measure user counts versus event counts. The ID is specific to the Homebrew package manager, and does not permit Homebrew maintainers to e.g. track you across websites you visit.

IANAL, but a UUID is definitely personal data under the GDPR:

‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;

Also see recital 30:

Natural persons may be associated with online identifiers provided by their devices, applications, tools and protocols, such as internet protocol addresses, cookie identifiers or other identifiers such as radio frequency identification tags. This may leave traces which, in particular when combined with unique identifiers and other information received by the servers, may be used to create profiles of the natural persons and identify them.

The GDPR doesn't only take into account whether an identifier can currently be used to identify a person, but also whether the data can be correlated in the future to do so (e.g. by correlating package installs with visiting project websites, thus deanonymizing the UUID).
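To make the correlation scenario concrete, here is a minimal sketch of how it could work in principle. Everything in it is hypothetical: the data, the field layout, the `correlate` helper and the 60-second window are assumptions for illustration, not a description of anything Homebrew or its analytics provider actually stores or does.

```python
# Hypothetical analytics events keyed by UUID: (uuid, package, unix timestamp).
installs = [
    ("uuid-a", "fzf", 1000),
    ("uuid-b", "ripgrep", 2000),
]

# Hypothetical web-server log of project site visits: (ip, project, unix timestamp).
visits = [
    ("203.0.113.7", "fzf", 990),
    ("198.51.100.4", "ripgrep", 1995),
    ("203.0.113.9", "fzf", 5000),
]


def correlate(installs, visits, window=60):
    """Link each UUID to IPs that visited the matching project site
    at most `window` seconds before the install event."""
    links = {}
    for uuid, package, t_install in installs:
        for ip, project, t_visit in visits:
            if project == package and 0 <= t_install - t_visit <= window:
                links.setdefault(uuid, set()).add(ip)
    return links


links = correlate(installs, visits)
```

Once a UUID is linked to an IP address even once, every other event carrying that UUID is linked to it as well, which is why a stable identifier matters more than any single event.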

The only safe way to abide by the GDPR is to avoid storing any non-essential data without consent.

I am pretty sure that Homebrew have been violating the GDPR for years by making analytics opt-out. Sadly, anyone who tries to warn them gets banned from their issue tracker.


Read your own cite: nothing about the UUID in question is associable with an identified or identifiable natural person, which is what the GDPR concerns.

We do not have the ability to correlate your package installs (again, we do not know what you install) with your browsing history, and we do not store any information that would allow us (or an adversary) to do so.

> Read your own cite: nothing about the UUID in question is associable with an identified or identifiable natural person, which is what the GDPR concerns.

This is false and a misunderstanding of the GDPR. The question is not whether identification is currently possible, but whether it would become possible if the UUID were correlated with other data.

What differs pseudonymisation from anonymisation is that the latter consists of removing personal identifiers, aggregating data, or processing this data in a way that it can no longer be related to an identified or identifiable individual. Unlike anonymised data, pseudonymised data qualifies as personal data under the General Data Protection Regulation (GDPR). Therefore, the distinction between these two concepts should be preserved.

https://edps.europa.eu/press-publications/press-news/blog/ps...

‘pseudonymisation’ means the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person;

https://gdpr-info.eu/art-4-gdpr/

So, basically if we have a data set with three columns:

Personal name, UUID, Action (e.g. brew install fzf)

Removing the first column is pseudonymisation, so what remains still qualifies as personal data under the GDPR. Removing both the first and the second column is anonymisation, and what remains is not personal data.
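A minimal sketch of the difference, using made-up rows for the three-column data set above. The names `pseudonymise` and `anonymise` and all values are hypothetical, and this only illustrates the distinction drawn here, not a legal test.

```python
# Hypothetical rows: (personal name, UUID, action). All values are made up.
rows = [
    ("Alice", "uuid-a", "brew install fzf"),
    ("Alice", "uuid-a", "brew install ripgrep"),
    ("Bob", "uuid-b", "brew install fzf"),
]


def pseudonymise(rows):
    """Drop the name column but keep the UUID: events stay linkable per user."""
    return [(uuid, action) for _name, uuid, action in rows]


def anonymise(rows):
    """Drop both the name and the UUID: only the actions remain."""
    return [action for _name, _uuid, action in rows]


pseudo = pseudonymise(rows)
anon = anonymise(rows)

# With the UUID kept, all events from one installation can still be grouped.
by_uuid = {}
for uuid, action in pseudo:
    by_uuid.setdefault(uuid, []).append(action)
```

In the pseudonymised set, `by_uuid["uuid-a"]` still holds the full install history of one user, so anyone who learns who `uuid-a` is learns all of it. In the anonymised set you can still count how often each package was installed, but you can no longer tell which installs belong together.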

Again IANAL, but it is clear from the GDPR that the only thing you could do without consent is e.g. recording what packages get installed/uninstalled, but without a UUID.

Apply the counterfactual: what would have to be the case in order to correlate the UUID in question with user data?

We do not store anything else that could correlate with that UUID. We don't expose it to anybody else and it's unclear how, even if we did, it would result in personal correlation.

> Apply the counterfactual: what would have to be the case in order to correlate the UUID in question with user data?

> We do not store anything else that could correlate with that UUID. We don't expose it to anybody else and it's unclear how, even if we did, it would result in personal correlation.

You can argue against this, but it's simply how the GDPR defines personal data, and if you violate it, someone could report you to their data protection authority.

Secondly, the GDPR does not just protect citizens against direct misuse of their personal data (I think most Homebrew users, including me, would be immediately convinced that you wouldn't misuse this data), but also against scenarios that are outside of your control. For example: Google decides to violate the GDPR against your will and correlates the data. Or: Google Analytics gets hacked, the data set becomes available on the black market or wherever, and people correlate it with other leaked data.

So, how would it be possible if it was correlated with other data?