Hacker News
by 0x62 53 days ago
Unfortunately you've now made an incredibly niche browser, and the lack of those metrics is a good fingerprint by itself. How browsers render SVGs can be used for fingerprinting (even the underlying OS affects this, and I assume you'll want to see those), combine with ISP from IP address, and unless there are hundreds of users in every city you're now pretty easily trackable.
2 comments

There's no problem with having a unique fingerprint. The problem is having a consistent one. Randomize the fingerprint every time and you're fine. The IP address problem applies to everyone, including anyone using Tor Browser. The only solution to that is not using your own IP address (VPN/proxy). If I were going to make a secure, privacy-focused browser, it wouldn't allow things like rendering SVGs (which have introduced vulnerabilities beyond tracking), wouldn't allow much (if any) JS, and would allow only a sane subset of CSS.
> Unfortunately you've now made an incredibly niche browser, and the lack of those metrics is a good fingerprint by itself.

If 100 people are using that browser, how will they know which one is me?

> How browsers render SVGs can be used for fingerprinting (even the underlying OS affects this, and I assume you'll want to see those)

Can you provide details on this? And how will they know which OS I'm using (through SVG rendering...)? The User-Agent definitely should not send the OS.

> combine with ISP from IP address

That's already provided whether I use Private mode or not, correct? I can always use a VPN.

You're the only one out of 100 that visits HN, or whose use matches a particular timezone, or who has the use pattern that [anti-]correlates with your work pattern, or ...
My brain is a bit slow today:

> You're the only one out of 100 that visits HN

So the HN operator sees someone using this browser, with this timezone. Then I go to some other site. Let's pretend that site's operator and HN's are identical. How will they know that I'm the same guy who went to HN? How does he know there aren't two people who use the browser in the same timezone (and the other one doesn't go to HN)?

I think the point is that it takes very few data points to effectively deanonymize someone. And the less common a data point is, the greater the information gain. "User is male" eliminates ~half of users. "User actively reads Hacker News" eliminates >99%. "User uses this niche browser that only 1000 people have ever been seen using" eliminates 99.999%.
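
To put rough numbers on "information gain", here is a minimal Python sketch that converts each of those percentages into bits. The function name `bits_of_information` is made up for illustration, the fractions are just the ballpark figures above (with ">99%" treated as exactly 99%), the ~8 billion world population is my own assumption, and adding the bits together assumes the data points are independent, which real traits usually are not.

```python
import math


def bits_of_information(fraction_remaining: float) -> float:
    """Bits learned from an observation that leaves this fraction of users."""
    if not 0 < fraction_remaining <= 1:
        raise ValueError("fraction_remaining must be in (0, 1]")
    return -math.log2(fraction_remaining)


# Fraction of users still matching after each observation.
# These are the rough figures from the comment, not measured data.
observations = {
    "user is male": 0.5,                 # eliminates ~half
    "actively reads HN": 0.01,           # eliminates ~99%
    "uses the niche browser": 0.00001,   # eliminates 99.999%
}

for label, fraction in observations.items():
    print(f"{label}: {bits_of_information(fraction):.1f} bits")

# Summing is only valid if the observations are independent (an assumption).
total_bits = sum(bits_of_information(f) for f in observations.values())

# Bits needed to single out one person among ~8 billion (assumed population).
bits_to_identify_world = math.log2(8e9)

print(f"total: {total_bits:.1f} of ~{bits_to_identify_world:.1f} bits needed")
```

Under those assumptions, three fairly mundane observations already cover most of the bits needed to pick one person out of everyone on the planet, and the rarest data point contributes by far the most.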

This is how surveillance operates at scale. You don't need a stable identifier linked to a specific person's identity; you just need a few data points to narrow it down to even a few thousand people. Then you apply more focus on those people, gathering data points that eliminate people until you're left with your target. And thanks to decades of global iteration on surveillance infrastructure, and AI to glue data sets together, it's all automated.
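
The narrowing-down step can be sketched as repeated filtering of a candidate pool. This is a toy illustration only: the `Visitor` record, the `narrow` helper, and the six-person population are all invented for the example, and a real tracker would work with far noisier, probabilistic signals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Visitor:
    """Hypothetical record of what a tracker has observed about one visitor."""
    visitor_id: int
    browser: str
    timezone: str
    reads_hn: bool


def narrow(candidates, **observed):
    """Keep only the candidates whose attributes match every observed value."""
    return [
        v for v in candidates
        if all(getattr(v, name) == value for name, value in observed.items())
    ]


# Invented toy population; no real data.
population = [
    Visitor(1, "mainstream", "UTC-8", True),
    Visitor(2, "mainstream", "UTC+1", False),
    Visitor(3, "niche", "UTC-8", False),
    Visitor(4, "niche", "UTC+1", True),
    Visitor(5, "niche", "UTC-8", True),
    Visitor(6, "mainstream", "UTC-8", False),
]

# Each new data point shrinks the candidate set.
step1 = narrow(population, browser="niche")
step2 = narrow(step1, timezone="UTC-8")
step3 = narrow(step2, reads_hn=True)

for label, step in [("browser", step1), ("timezone", step2), ("reads HN", step3)]:
    print(label, [v.visitor_id for v in step])
```

Note that after the first two filters there are still two candidates sharing the same browser and timezone, so no single data point identifies anyone; it is the third one that leaves a single match.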