Hacker News
by arama471 2788 days ago
Not necessarily - it could just change your anonymous ID frequently enough that it would be impossible to know which vehicle did what. If you can't track a single car then you don't know if a car started and another parked where it used to be or if a car just drove past a parked car (assuming the data is not unreasonably precise).

How frequently do you propose? If it changed your 'anonymous' ID daily, for example, an observer would only need to look at the rest of the data: cars that seemingly move from the very same home address to the very same work address, with a pause of ~8 hours at work and another pause between ~10 PM and ~7 AM for sleep.
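To make the linking argument concrete, here is a rough sketch of how daily IDs could be tied back together from dwell locations alone. Everything in it is an assumption for illustration: the function names, the (hour, lat, lng) ping format, the grid size and the hour windows are made up, not taken from the device in the article.

```python
from collections import defaultdict

def dwell_signature(pings, cell_deg=0.001):
    """pings: list of (hour_of_day, lat, lng) seen under ONE daily ID.
    Returns (night_cell, work_cell), or None if either window is empty."""
    def cell(lat, lng):
        # Snap to a coarse grid (assumed ~100 m) so small GPS jitter is ignored.
        return (round(lat / cell_deg), round(lng / cell_deg))

    night = [cell(lat, lng) for h, lat, lng in pings if h >= 22 or h < 7]
    work = [cell(lat, lng) for h, lat, lng in pings if 9 <= h < 17]
    if not night or not work:
        return None
    # Most frequent cell in each window stands in for "home" and "work".
    most_common = lambda cells: max(set(cells), key=cells.count)
    return (most_common(night), most_common(work))

def link_daily_ids(days):
    """days: dict of daily_id -> pings. Groups daily IDs by shared signature."""
    linked = defaultdict(list)
    for daily_id, pings in days.items():
        signature = dwell_signature(pings)
        if signature is not None:
            linked[signature].append(daily_id)
    return dict(linked)

# Hypothetical data: the same car under two daily IDs, plus one other car.
days = {
    "mon-a1": [(2, 52.370, 4.895), (12, 52.090, 5.120), (23, 52.370, 4.895)],
    "tue-b7": [(3, 52.370, 4.895), (13, 52.090, 5.120), (22, 52.370, 4.895)],
    "mon-c3": [(1, 51.920, 4.480), (11, 52.010, 4.360), (23, 51.920, 4.480)],
}
groups = link_daily_ids(days)
```

With this toy data the two IDs that share a home and work cell end up in the same group, which is the whole problem with rotating the ID only once a day.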

The device discussed in the article also knows your own SSID and the surrounding SSIDs (it supposedly has WLAN to communicate), so it can also work out where you live using, e.g., Wigle [1].

A way to actually anonymise the data would be to use hashes and only use those instead.

[1] https://wigle.net/

An interesting anonymization - useless for instant traffic updates, though - would be to fuzz the location based on vehicle speed. When you're moving quickly, very precise data would be shared, but when you're stopped, uncertainty would be increased - that way, where you're parked every night would be much less precise than the highway you take to work.
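A minimal sketch of what speed-based fuzzing might look like. The constants MAX_FUZZ_M and FULL_SPEED_MPS, the linear ramp between them, and the function names are all assumptions picked for illustration; the comment above doesn't specify any of them.

```python
import math
import random

MAX_FUZZ_M = 1600.0     # assumed: about a mile of uncertainty when stopped
FULL_SPEED_MPS = 25.0   # assumed: at or above this speed no fuzz is added

def fuzz_radius_m(speed_mps):
    """Fuzz radius shrinks linearly from MAX_FUZZ_M (stopped) to 0 (fast)."""
    fraction = min(max(speed_mps, 0.0) / FULL_SPEED_MPS, 1.0)
    return MAX_FUZZ_M * (1.0 - fraction)

def fuzz_position(lat, lng, speed_mps, rng=random):
    """Return (lat, lng) moved by a random offset inside the fuzz radius."""
    radius = fuzz_radius_m(speed_mps)
    # sqrt() spreads points uniformly over the disc instead of bunching
    # them near the centre.
    distance = radius * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    north_m = distance * math.cos(bearing)
    east_m = distance * math.sin(bearing)
    # Flat-earth approximation: roughly 111,320 m per degree of latitude.
    dlat = north_m / 111_320.0
    dlng = east_m / (111_320.0 * math.cos(math.radians(lat)))
    return lat + dlat, lng + dlng

parked = fuzz_position(52.0, 4.0, speed_mps=0.0)    # heavily fuzzed
highway = fuzz_position(52.0, 4.0, speed_mps=30.0)  # passed through as-is
```

The flat-earth conversion is only reasonable for offsets of a mile or so and away from the poles.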

This wouldn't do much good in more rural areas, of course - you could probably zero in on exactly where my parents park their car with a month's worth of data even if you add a mile-wide fuzz to every parked check-in.
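The averaging problem can be sketched with a small simulation. It assumes the fuzz is independent each night and uniform over a disc about a mile wide, and that all check-ins can be attributed to the same car (plausible when there is only one house around). The names and numbers are made up for illustration, and coordinates are plain metres on a local plane.

```python
import math
import random

def fuzzed_checkin(true_xy, radius_m, rng):
    """One parked check-in: the true spot plus a uniform offset in a disc."""
    distance = radius_m * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    return (true_xy[0] + distance * math.cos(bearing),
            true_xy[1] + distance * math.sin(bearing))

def estimate_parking_spot(checkins):
    """Attacker's guess: the plain average of all fuzzed check-ins."""
    n = len(checkins)
    return (sum(x for x, _ in checkins) / n,
            sum(y for _, y in checkins) / n)

rng = random.Random(0)   # fixed seed so the run is repeatable
true_spot = (0.0, 0.0)
radius_m = 805.0         # half a mile, i.e. a mile-wide fuzz
nights = [fuzzed_checkin(true_spot, radius_m, rng) for _ in range(30)]
estimate = estimate_parking_spot(nights)
error_m = math.hypot(estimate[0] - true_spot[0], estimate[1] - true_spot[1])
print(f"estimate is {error_m:.0f} m from the real spot")
```

Because the noise averages out, the error of the estimate shrinks roughly with the square root of the number of check-ins, so more data only makes it worse for the car owner.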

Maybe it needs some sort of deterministic bias based on user ID? Offset each user's location by 1*Math.rand(hashseed) miles in the direction randVector(hashseed).
Why does it have to be deterministic per-user? Why not completely randomized across every user?
I guess it depends on the application. A consistent offset would allow measurement of distances, but that does leak some information, so perhaps all-random is better for many uses.
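The two options from this sub-thread could be sketched as below. This is only one way to read the pseudocode: the SHA-256 derivation, the secret salt, the function names and working in metres on a local plane are all assumptions, not anything specified above.

```python
import hashlib
import math
import random

METERS_PER_MILE = 1609.344

def per_user_offset_m(user_id, secret, max_miles=1.0):
    """Deterministic offset: the same user always gets the same shift.
    `secret` is a hypothetical server-side salt (bytes)."""
    digest = hashlib.sha256(secret + user_id.encode("utf-8")).digest()
    # Two 32-bit slices of the hash become fractions in [0, 1).
    dist_frac = int.from_bytes(digest[0:4], "big") / 2**32
    angle_frac = int.from_bytes(digest[4:8], "big") / 2**32
    distance = max_miles * METERS_PER_MILE * dist_frac
    bearing = 2.0 * math.pi * angle_frac
    return (distance * math.cos(bearing), distance * math.sin(bearing))

def random_offset_m(max_miles=1.0, rng=random):
    """All-random alternative: a fresh offset for every single report."""
    distance = max_miles * METERS_PER_MILE * rng.random()
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    return (distance * math.cos(bearing), distance * math.sin(bearing))

def shift(point_m, offset_m):
    return (point_m[0] + offset_m[0], point_m[1] + offset_m[1])

# A consistent offset preserves distances between one user's own points.
offset = per_user_offset_m("car-42", secret=b"example-salt")
home, work = (0.0, 0.0), (3000.0, 4000.0)
shifted_home, shifted_work = shift(home, offset), shift(work, offset)
trip_m = math.hypot(shifted_work[0] - shifted_home[0],
                    shifted_work[1] - shifted_home[1])
```

Here trip_m still comes out at the real 5000 m, which is the "measurement of distances" upside. The downside is the leak mentioned above: the whole track is just translated, so its shape survives intact.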
It could be done in such a way that no device identifier or any other identifier is ever sent, just (time, current lat/lng, previous lat/lng).
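As a rough sketch, an identifier-free report could look like this. The SegmentReport type, its field names and the (unix_time, lat, lng) input format are hypothetical; the only thing taken from the comment is the tuple of time, current position and previous position.

```python
from dataclasses import dataclass, asdict
from typing import Iterable, Iterator, Tuple

Fix = Tuple[float, float, float]  # assumed input: (unix_time, lat, lng)

@dataclass(frozen=True)
class SegmentReport:
    """Everything that leaves the device: no device ID, no user ID."""
    time: float
    current_lat: float
    current_lng: float
    previous_lat: float
    previous_lng: float

def segment_reports(fixes: Iterable[Fix]) -> Iterator[SegmentReport]:
    """Turn consecutive GPS fixes into identifier-free segment reports."""
    previous = None
    for t, lat, lng in fixes:
        if previous is not None:
            yield SegmentReport(t, lat, lng, previous[0], previous[1])
        previous = (lat, lng)

fixes = [(1000.0, 52.000, 4.000), (1010.0, 52.001, 4.001), (1020.0, 52.002, 4.003)]
reports = list(segment_reports(fixes))
```

One thing to keep in mind: each report's previous position is the prior report's current position, so with precise coordinates the segments can probably still be chained back into a full trip.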