Hacker News
by fsckboy 973 days ago
what's the definition of truly anonymous? they don't know your name? or there isn't enough data to identify you? I've heard that in the US, birth date and ZIP code are enough to identify most people, yet data like that could still be considered anonymous.

if data from multiple users is aggregated, I think that's closer to what people have in mind when they say "anonymous"

1 comment

There are multiple definitions. The most basic (and common) is k-anonymity [1]. Basically, if you group a collection of data by all variables that are already non-anonymous (like age, address, gender, occupation) and end up with any group of fewer than k people (k=5 is common), then every other data item in the data set linked to those individuals also becomes non-anonymous (PII).
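A rough sketch of that check in Python, in case it helps. The records, the column order and the helper names (quasi_id, small_groups, is_k_anonymous) are all made up for illustration; which columns count as "already known" is an assumption you have to make for your own data.

```python
from collections import Counter

# Hypothetical toy records: (age, postal_code, gender, diagnosis).
# The first three columns are assumed to be already known about a person.
RECORDS = [
    (49, "12345", "M", "flu"),
    (49, "12345", "M", "asthma"),
    (49, "12345", "M", "flu"),
    (31, "12345", "F", "flu"),
    (31, "12345", "F", "diabetes"),
    (62, "67890", "M", "asthma"),
]


def quasi_id(record):
    """Return the columns treated as already non-anonymous."""
    return record[:3]


def small_groups(records, k):
    """Return {group key: size} for every group with fewer than k records."""
    sizes = Counter(quasi_id(r) for r in records)
    return {key: n for key, n in sizes.items() if n < k}


def is_k_anonymous(records, k):
    """True when no group of identical quasi-identifiers is smaller than k."""
    return not small_groups(records, k)
```

With this toy data and k=3, small_groups flags the (31, "12345", "F") group (2 people) and the (62, "67890", "M") group (1 person), so the diagnoses in those rows would count as PII.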

Even if every group has at least k people, though, a data item may still be non-anonymous if there is not enough diversity within the group. For instance, if every 49-year-old male in a given postal code and occupation has the same religion, then religion is non-anonymous for that group, according to l-diversity [2].
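The simplest form of this ("distinct" l-diversity: each group must contain at least l different values of the sensitive item) can be sketched like this. The rows and function names are hypothetical, and stricter variants of l-diversity exist that this does not cover.

```python
from collections import defaultdict

# Hypothetical rows: ((age, postal_code, gender, occupation), religion).
ROWS = [
    ((49, "12345", "M", "teacher"), "X"),
    ((49, "12345", "M", "teacher"), "X"),
    ((49, "12345", "M", "teacher"), "X"),
    ((35, "12345", "F", "nurse"), "X"),
    ((35, "12345", "F", "nurse"), "Y"),
    ((35, "12345", "F", "nurse"), "Z"),
]


def low_diversity_groups(rows, ell):
    """Return groups whose sensitive column has fewer than ell distinct values."""
    seen = defaultdict(set)
    for key, sensitive in rows:
        seen[key].add(sensitive)
    return {key: values for key, values in seen.items() if len(values) < ell}


def is_l_diverse(rows, ell):
    """True when every group has at least ell distinct sensitive values."""
    return not low_diversity_groups(rows, ell)
```

Both groups here have 3 people, so the data is 3-anonymous, but low_diversity_groups(ROWS, 2) still flags the teachers: knowing someone is in that group tells you their religion.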

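This can be narrowed down even more by t-closeness [3], which requires the distribution of a sensitive item within each group to be within a distance t of its distribution in the whole data set.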
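A minimal sketch of a t-closeness check, comparing each group's distribution of a sensitive value against the distribution over the whole data set. Assumption: the sensitive value is categorical and all categories are treated as equally far apart, in which case the Earth Mover's Distance used by t-closeness reduces to half the sum of absolute differences. The data and names are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical rows: (group key, sensitive value).
ROWS = [
    ("g1", "flu"), ("g1", "flu"), ("g1", "flu"), ("g1", "cancer"),
    ("g2", "flu"), ("g2", "cancer"), ("g2", "cancer"), ("g2", "cancer"),
]


def distribution(values):
    """Map each value to its relative frequency."""
    counts = Counter(values)
    return {v: n / len(values) for v, n in counts.items()}


def variational_distance(p, q):
    """Half the L1 distance between two categorical distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def max_group_distance(rows):
    """Largest distance between any group's distribution and the overall one."""
    overall = distribution([s for _, s in rows])
    groups = defaultdict(list)
    for key, sensitive in rows:
        groups[key].append(sensitive)
    return max(
        variational_distance(distribution(vals), overall)
        for vals in groups.values()
    )


def is_t_close(rows, t):
    """True when every group is within distance t of the overall distribution."""
    return max_group_distance(rows) <= t
```

Here the overall split is 50/50 and each group is 75/25, so the largest distance is 0.25: the data passes for t=0.3 and fails for t=0.2.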

  [1] https://en.wikipedia.org/wiki/K-anonymity
  [2] https://en.wikipedia.org/wiki/L-diversity
  [3] https://en.wikipedia.org/wiki/T-closeness