by mikepurvis 1928 days ago
I think it still matters. It may be way harder to track an individual user, but you could still infer which queries lead to other queries, or watch for bursts/crescendos of traffic on a particular topic in response to current events.

On your first claim, I doubt this can be done past a certain scale. If DDG is sending queries from millions of users, it's much harder to untangle them.

On your second point, absolutely: Microsoft is getting very good aggregated data from their API, which is useful to them, but that's not really a privacy violation for individual users.

Yeah, that's fair. Once the queries are anonymized and stripped of any locale information, there isn't much more to go on. And while there may be technical reasons to want to cache popular searches, doing so mostly just denies your upstream analytics that are fairly reasonable for them to want.

OTOH, Debian deliberately provides the technical means for third parties to host verifiable mirrors of their package repository, and then makes the analytics an opt-in thing (popcon).