by Spivak
2029 days ago
It's an easy-to-state, largely foolproof test of whether data really is anonymized. The worry with poorly anonymized datasets is that someone who has another, non-anonymized dataset can combine the two to deduce the original information. "Your dataset must not be combinable with any other dataset in a way that lets someone infer the original data" is a hard requirement to check. How could you possibly test them all? Well, it turns out there is one non-anonymized dataset with the property that if you can't connect your anonymized data to it at all, you can be pretty sure you couldn't connect it to any others: the original data!
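
To make that concrete, here is a rough sketch of what such a check might look like. Everything in it is made up for illustration: the function name `unique_link_rate`, the toy records, and especially the matching rule (exact equality on a chosen set of columns), which is a crude stand-in for whatever linkage a real attacker would try.

```python
from collections import Counter

def unique_link_rate(original, anonymized, shared_keys):
    """Fraction of anonymized rows that match exactly one original row
    on shared_keys. Exact-equality matching is an assumption of this sketch."""
    if not anonymized:
        return 0.0
    # How many original rows carry each combination of key values.
    counts = Counter(tuple(row[k] for k in shared_keys) for row in original)
    # An anonymized row is "linked" if its combination points at one original row.
    linked = sum(
        1 for row in anonymized
        if counts[tuple(row[k] for k in shared_keys)] == 1
    )
    return linked / len(anonymized)

# Hypothetical toy data.
original = [
    {"name": "Ann", "zip": "02139", "birth_year": 1980, "diagnosis": "flu"},
    {"name": "Bob", "zip": "02139", "birth_year": 1975, "diagnosis": "asthma"},
    {"name": "Cy", "zip": "94110", "birth_year": 1980, "diagnosis": "flu"},
]

# "Anonymized" by dropping only the name column.
names_dropped = [{k: v for k, v in r.items() if k != "name"} for r in original]

# "Anonymized" by keeping only the diagnosis column.
diagnosis_only = [{"diagnosis": r["diagnosis"]} for r in original]

rate_names_dropped = unique_link_rate(original, names_dropped, ["zip", "birth_year"])
rate_diagnosis_only = unique_link_rate(original, diagnosis_only, ["diagnosis"])
```

In this toy data every row with only the name dropped links back to a single original row, and even the diagnosis-only version still links one row in three, because only one person has asthma. A rate above zero against the original data is the warning sign.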