by ryandrake 3074 days ago
A textbook example is the AOL search history release [1]. They went to the trouble of wiping user account information but left a unique numeric identifier on each user's queries, so every person's full search history could still be stitched back together. Oops, someone didn't think that one through.

1: https://en.wikipedia.org/wiki/AOL_search_data_leak
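
To make the failure mode concrete, here is a rough sketch of why a unique per-user ID defeats the wipe. The rows, the IDs, and the `profiles_by_id` helper are all made up for illustration; this is not the actual AOL data or format.

```python
from collections import defaultdict

# Hypothetical log rows shaped as (pseudonymous_id, query). All values are invented.
log = [
    (1001, "landscapers in springfield"),
    (2002, "cheap flights"),
    (1001, "jane doe springfield"),
    (1001, "knee pain remedies"),
    (2002, "weather tomorrow"),
]

def profiles_by_id(rows):
    """Group every query under the identifier it was logged with."""
    profiles = defaultdict(list)
    for user_id, query in rows:
        profiles[user_id].append(query)
    return dict(profiles)

profiles = profiles_by_id(log)
for user_id, queries in profiles.items():
    # No name or account is stored, yet each ID now carries a full history
    # that can contain a town, a surname, and health details.
    print(user_id, queries)
```

The point is that nothing clever is needed: a single group-by turns "anonymous" rows back into per-person profiles.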


How is that an example of a failure of data anonymization? To me it just looks like AOL did a shitty job, not that the concept as a whole is a lost cause.
Most breaches happen because someone "did a shitty job".

The truth is, you're doing a shitty job if you don't recognize anonymization for what it is: essentially trying to have your cake and eat it too. In practice, it has specific constraints that must be met, and I'd judge the difficulty of doing a good job here to be similar to rolling your own crypto. That is, unless you're a good statistician, you're better off not sharing the data (or not having it in the first place) than releasing it "anonymized".
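
As one illustration of what a "specific constraint" can look like, here is a minimal sketch of a k-anonymity check. k-anonymity is my choice of example, not something named above, and passing it is by no means sufficient on its own. The records, the column names, and both helper functions are hypothetical.

```python
from collections import Counter

def smallest_group(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same quasi-identifier values."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(counts.values()) if counts else 0

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values appears at least k times."""
    return smallest_group(records, quasi_identifiers) >= k

# Invented example rows; "zip" and "age_band" are assumed to be the quasi-identifiers.
records = [
    {"zip": "12345", "age_band": "30-39", "diagnosis": "flu"},
    {"zip": "12345", "age_band": "30-39", "diagnosis": "asthma"},
    {"zip": "12345", "age_band": "40-49", "diagnosis": "flu"},
]

# The single 40-49 row is unique on (zip, age_band), so the check fails for k=2.
print(is_k_anonymous(records, ["zip", "age_band"], k=2))
# Dropping age_band makes all three rows indistinguishable on zip alone.
print(is_k_anonymous(records, ["zip"], k=2))
```

Even this toy check shows the trade-off: the way to pass it is to throw away or coarsen columns, which is exactly the cake you were hoping to keep.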