by code_duck 2931 days ago
The difference is how the data is used and whether it's associated with an individual's permanent personal data. If it's only gathered anonymously and used for internal UI improvement, that isn't objectionable. At the other extreme, association with a real-world individual enables many uses that are potentially harmful, such as unmasking those who legitimately prefer to be anonymous.

Edit: I wanted to add that I didn't intend to focus on whether the data is shared. I think FB having and using it is bad enough, especially if they're ubiquitous. Also, once anyone creates such data, other entities such as governments will seek to obtain it and will likely succeed eventually.

3 comments

Anonymous data is one identifier or clever match away from identification. This is particularly severe for sites/services/hardware that record audio (man, who would knowingly allow that kind of abuse??), but it can apply to mouse tracking too. Mouse tracking fingerprints could be used to re-identify all sorts of other things.
It depends how much data is collected in the first place, and how much is available to the person trying to break anonymization. If I'm not mistaken, everything is deanonymizable with global traffic analysis.
For some people, there is ultimately no difference. It's not about whether they are doing anything malicious with the collected data now; it's about the fact that they CAN do so in the future if they want to, ethical or not. The data is being collected, stored (perhaps indefinitely), and will always be accessible. In a world where capitalism reigns, to think that any large corporation would treat our data with the best care and in our best interests seems a little silly. I have always held the belief that businesses are not people, and it's reasonable to expect that businesses may not always be inclined to do the right thing, especially if doing otherwise gains them more money and power.
I'm picturing the minimum as a system that collects nothing more than mouse movements, without the IP address, full URL, user account, and other details that could easily tie it to a person. I mean the minimum more as an abstract ideal than as something anyone actually does.
There is no such thing as anonymous data. The belief that something is anonymous is simply an aspect of statistical ignorance or naiveté.

And since we can't know in advance all the ways data can be combined, recombined, projected, and analyzed, there is no such thing as informed consent to use said data unless it is specifically restricted to a single analysis using only the given data.
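
A toy way to see the combination problem, as a rough Python sketch: the records, field names (`zip3`, `browser`, `hour`), and the `unique_fraction` helper are all made up for illustration, not taken from any real dataset or library. Each field looks harmless on its own, but the share of rows that are one of a kind grows as fields are combined.

```python
from collections import Counter

# Hypothetical "anonymized" records: no name, no account ID.
records = [
    {"zip3": "941", "browser": "Firefox", "hour": 9},
    {"zip3": "941", "browser": "Firefox", "hour": 22},
    {"zip3": "941", "browser": "Chrome", "hour": 9},
    {"zip3": "100", "browser": "Chrome", "hour": 9},
    {"zip3": "100", "browser": "Chrome", "hour": 9},
]

def unique_fraction(rows, fields):
    """Fraction of rows whose combination of `fields` appears exactly once."""
    if not rows:
        return 0.0
    keys = [tuple(row[f] for f in fields) for row in rows]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(rows)

for fields in (["zip3"], ["zip3", "browser"], ["zip3", "browser", "hour"]):
    print(fields, unique_fraction(records, fields))
```

With this toy data, no row is unique by `zip3` alone, one of five is unique once `browser` is added, and three of five are unique with all three fields. Real datasets are far larger, so the numbers here only show the direction of the effect.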

I realize this, but I can picture a bare-minimum store of heatmap data that would be extremely difficult to use for anything other than knowing what people on the website clicked on. Indeed, the more info collected, the more likely someone can combine it with other data to draw broader conclusions.

For example, any time you store a precise time in connection with user actions, that has privacy implications. I picture simply not recording the time or exact URL. If the system is designed without any sort of privacy in mind and just records whatever data is convenient, it collects too much, and that's easier to abuse than one that intentionally records a minimum. I agree it's amazing the way all of this can be subverted, and yes, I realize that HN is stocked with data scientists who are more knowledgeable about this than I am.
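
To make the "record a minimum" idea concrete, here is a rough Python sketch of what I have in mind. Everything in it is hypothetical (the `MinimalHeatmap` class, the 50-pixel cell size, the method names), and it is only the abstract ideal, not something I know any real product to do. It keeps per-cell click counts and nothing else: no time, no URL, no IP, no user ID.

```python
from collections import Counter

CELL = 50  # hypothetical cell size in pixels; coarser cells keep less detail


class MinimalHeatmap:
    """Keeps only per-cell click counts: no timestamp, URL, IP, or user ID."""

    def __init__(self, cell=CELL):
        self.cell = cell
        self.counts = Counter()

    def record_click(self, x, y):
        # Snap to a coarse grid cell and discard the exact coordinates.
        self.counts[(x // self.cell, y // self.cell)] += 1

    def hottest(self, n=3):
        # Most-clicked cells as ((col, row), count) pairs.
        return self.counts.most_common(n)


hm = MinimalHeatmap()
hm.record_click(120, 40)
hm.record_click(130, 45)
hm.record_click(400, 300)
print(hm.hottest(1))  # [((2, 0), 2)]
```

Since the store never sees a timestamp or an identifier, there is nothing to join against other datasets later, although as pointed out above that is an argument about difficulty, not a guarantee of anonymity.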