by a785236
2473 days ago
I wish the authors wouldn't oversell the privacy claim:

> Github: "The DeepPrivacy GAN never sees any privacy sensitive information, ensuring a fully anonymized image."

> Abstract: "We ensure total anonymization of all faces in an image by generating images exclusively on privacy-safe information."

> Paper: "We propose a novel generator architecture to anonymize faces, which ensures 100% removal of privacy-sensitive information in the original face."

Changing a face anonymizes an image the same way that removing a name anonymizes a dataset -- poorly. This is cool, but it's not anonymization.
|
For clarity, it might help to define three terms as I use them: "identifiable" data is the original, an encrypted copy whose key is still available, or a hash or Bloom filter (or similar) of low-entropy data such as email addresses or phone numbers; "pseudonymous" data has been replaced with a unique but disconnected value (e.g. a UUID, or ciphertext under a random key that was then destroyed); and "anonymous" data is either no data at all or data that has no relation to the original.
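To make the "identifiable" case concrete, here is a minimal sketch of why a hash of low-entropy data still counts as identifiable in my book: the input space is small enough to enumerate. The phone number format, the `recover_phone` helper, and the choice of unsalted SHA-256 are all assumptions for illustration, not anything taken from the paper or the repo.

```python
import hashlib
from typing import Optional


def sha256_hex(value: str) -> str:
    """Hash a string the way a naive 'anonymizer' might (unsalted SHA-256)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def recover_phone(target_hash: str, prefix: str, digits: int) -> Optional[str]:
    """Brute-force every number with the given prefix until the hash matches.

    Hypothetical helper: it only works because the input space is tiny
    (10**digits candidates), which is the point being illustrated.
    """
    for n in range(10 ** digits):
        candidate = f"{prefix}{n:0{digits}d}"
        if sha256_hex(candidate) == target_hash:
            return candidate
    return None


# Made-up example value; the "dataset" stores only the hash.
leaked_hash = sha256_hex("555-0142")

# An attacker who knows the format enumerates all 10,000 possibilities.
recovered = recover_phone(leaked_hash, prefix="555-", digits=4)
```

With real phone numbers the search space is larger, but still small enough to enumerate on ordinary hardware, which is why I put such hashes in the "identifiable" bucket rather than the "pseudonymous" one.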
As far as I can tell, this algorithm replaces the data with a random value that has no relation to the original. I understand that if we have a list of HN comment metadata and you remove the usernames ("anonymize" it), I can still be found by correlating the posting times with DNS request logs at the ISP. With pictures, I guess the place is usually identifiable and the time is known, so you can potentially piece together who was there at that time, corroborated by the presence of a certain backpack or shirt.
Is that what you mean, or is there something else that makes you say it is either still identifiable or pseudonymized rather than anonymized?
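The timing correlation I describe above can be sketched roughly like this. Everything here is hypothetical: the comment IDs, the subscriber labels, the log layout, and the 5-second window are made up to show the shape of the linkage, not how any real ISP log looks.

```python
from datetime import datetime, timedelta


def link_by_time(comments, dns_log, window=timedelta(seconds=5)):
    """For each username-stripped comment, list subscribers seen near its post time.

    comments: iterable of (comment_id, posted_at)
    dns_log:  iterable of (subscriber, seen_at) for lookups of the site
    """
    matches = {}
    for comment_id, posted_at in comments:
        matches[comment_id] = {
            subscriber
            for subscriber, seen_at in dns_log
            if abs(seen_at - posted_at) <= window
        }
    return matches


t0 = datetime(2020, 1, 1, 12, 0, 0)

# "Anonymized" comment metadata: usernames removed, timestamps kept.
comments = [
    ("c1", t0),
    ("c2", t0 + timedelta(hours=1)),
]

# Hypothetical ISP-side log of DNS requests for the site.
dns_log = [
    ("subscriber-A", t0 - timedelta(seconds=2)),
    ("subscriber-B", t0 + timedelta(minutes=10)),
    ("subscriber-B", t0 + timedelta(hours=1, seconds=1)),
]

matches = link_by_time(comments, dns_log)
```

Here each comment ends up with a single candidate subscriber even though no username was ever stored. The picture case would work the same way, with place and time playing the role of the timestamp and a backpack or shirt narrowing the candidates further.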