by dsacco
3025 days ago
I’ve done research on credit card data like that. I can tell you both from experience and mathematically that four bits of random information is insufficient to identify people. The information was not anonymized, and they were tracking people engaging in a common, narrow activity. Not only that, but they were only tracking 1.1 million individuals. They had a relatively small search space and significant non-random information with which to bootstrap the deanonymization. Calling that “four bits” is disingenuous.

Contrast this with trying to identify a single individual in a population with no other information about them. It would take about 33 bits if we knew absolutely nothing about her, given log_2(7,280,000,000) ≈ 32.8. But we know she’s American, so we can cut our search space down to 322,000,000. That leaves us with about 28 bits. We also know she’s a woman, so we can cut our search space roughly in half, which removes one more bit. Now we have about 27 bits to go.

I can virtually guarantee you an analysis of anonymous donation patterns will not meaningfully cut down the search space beyond a few more bits, and that’s exceptionally non-random data. The more useful information is knowing that she resides in New Hampshire, but that still only brings us down to approximately 20 bits.
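
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the bit counting above. The helper name `bits_to_identify` is made up for illustration, and the 161,000,000 figure is just the 322,000,000 from the comment cut in half (an assumed even split, not a census number).

```python
import math


def bits_to_identify(population: int) -> float:
    """Bits needed to single out one person among `population` equally likely candidates."""
    return math.log2(population)


# Population figures are the ones quoted in the comment; the last is an assumed 50% split.
search_spaces = [
    ("no information (world)", 7_280_000_000),
    ("known to be American", 322_000_000),
    ("American and a woman", 322_000_000 // 2),
]

for label, size in search_spaces:
    print(f"{label}: {bits_to_identify(size):.1f} bits remaining")
```

Each halving of the search space removes exactly one bit, which is why a binary attribute like sex buys so little compared with a narrow one like state of residence.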