by jeroenhd
972 days ago
Among the telemetry data:
> MacAddressHash - Used to identify a user of VS Code. This is hashed once on the client side and then hashed again on the pipeline side to make it impossible to identify a given user. On VS Code for the Web, a UUID is generated for this case.
A hash of a hash is about as expensive as a single hash, and it still uniquely identifies a machine, tying telemetry events to a specific user's machine. Microsoft's own telemetry description generator calls the field "EndUserPseudonymizedInformation". Pseudonymisation is inherently not anonymisation. This bullshit is why I keep my PiHole on for my dev environment.
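To illustrate the point about a hash of a hash, here is a minimal sketch. It assumes both hashing steps are plain unsalted SHA-256, which the quoted description does not actually state, and the function names are made up for the example. It only shows that the result is still a stable identifier that anyone with a candidate MAC address can test against.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Hex digest of an unsalted SHA-256 hash (an assumed algorithm)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def double_hash(mac_address: str) -> str:
    """Hash once (client side), then hash the result again (pipeline side)."""
    return sha256_hex(sha256_hex(mac_address))


def matches_candidate(stored_hash: str, candidate_mac: str) -> bool:
    """Check whether a stored double hash came from a given MAC address."""
    return stored_hash == double_hash(candidate_mac)


if __name__ == "__main__":
    stored = double_hash("00:11:22:33:44:55")
    # Same machine always maps to the same value, so events can be linked.
    print(stored == double_hash("00:11:22:33:44:55"))
    # Anyone holding a candidate address can confirm or rule out a match.
    print(matches_candidate(stored, "00:11:22:33:44:55"))
```

Under those assumptions the second hash adds no secrecy: the mapping from MAC address to identifier is still deterministic and publicly computable.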
|
It's important, though, if you have multiple products, to use a _different_ pseudonymization for each one (a different hash salt or whatever). Otherwise you risk storing data that links too much about a user, which in the worst case de-pseudonymizes them even though no individual app does. Having a user's behavior across multiple applications could pose such a risk in extreme cases.
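A rough sketch of what "a different pseudonymization per product" could look like, using a keyed hash (HMAC-SHA-256) with a separate key per product. The key values, product names, and function name are hypothetical, and HMAC is my choice for the example rather than anything a particular vendor is known to use.

```python
import hashlib
import hmac


def pseudonymize_for_product(user_id: str, product_key: bytes) -> str:
    """Derive a per-product pseudonym from a user identifier with a keyed hash."""
    return hmac.new(product_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    # Hypothetical per-product keys; in practice each would be a random secret.
    editor_key = b"example-key-for-editor"
    browser_key = b"example-key-for-browser"

    in_editor = pseudonymize_for_product("user-1234", editor_key)
    in_browser = pseudonymize_for_product("user-1234", browser_key)

    # The same user gets unrelated pseudonyms in the two products,
    # so the two datasets cannot be joined on this column alone.
    print(in_editor != in_browser)
```

Within one product the pseudonym stays stable, so per-product analytics still work; the datasets just can't be trivially joined across products.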
Edit: I think it's important to separate plain "hashing" from properly salted "hashing". A properly hashed identifier uses a salt that is generated on the client, so that it can't be used to identify the user. Basically: the first time the app runs, you generate a random salt which is only stored on the client and NEVER sent in telemetry. Anything you would like to transmit over the wire that would risk identifying the user (e.g. a computer name or MAC address) you hash with this local salt. This way no one can go to the database on the server side and try to match the data, e.g. check whether the hash abc123 matches the computer name jimbob because hash("jimbob") = abc123. Just sending hash(MacAddress) without a local random salt would NOT be properly pseudonymous, because an attacker on the server side could ask and answer the question "Does this come from the address macaddress?".
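The scheme described above could be sketched roughly like this. The file location, salt length, and function names are assumptions for illustration, and I'm using HMAC-SHA-256 as the "hash with a salt" step since it's a standard way to mix in a secret.

```python
import hashlib
import hmac
import secrets
from pathlib import Path

SALT_BYTES = 32  # assumed length; any sufficiently long random value works


def load_or_create_salt(path: Path) -> bytes:
    """Return the salt stored at `path`, generating a random one on first run."""
    if path.exists():
        return path.read_bytes()
    salt = secrets.token_bytes(SALT_BYTES)
    path.write_bytes(salt)  # stays on the client, never goes into telemetry
    return salt


def pseudonymize(value: str, salt: bytes) -> str:
    """Hash an identifying value (computer name, MAC address) with the local salt."""
    return hmac.new(salt, value.encode("utf-8"), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    salt = load_or_create_salt(Path("telemetry_salt.bin"))  # hypothetical location
    # Stable for this installation, but not reproducible by anyone
    # who doesn't have the salt file.
    print(pseudonymize("jimbob", salt))
```

Without the salt, someone on the server side can't compute the value for "jimbob" and compare, which is exactly the check that an unsalted hash(MacAddress) allows.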