by codetrotter 1962 days ago
> yet I see uncommon values for WebGL Vendor and WebGL Renderer

I don’t know how Firefox does it, but instead of trying to make the WebGL fingerprint the same for everyone every time, they could also try to make it unique for everyone every time, and it would have the same effect.

If every time you loaded a page your WebGL fingerprint differed then a website can’t use that to tell if it was the same browser that loaded the same page previously or any other page anywhere else previously.

(Assuming that the WebGL fingerprint anonymization was so good that it could indeed not be correlated between different fingerprints in any meaningful way.)


Part of the problem with either approach is that certain identifying information is also necessary for just making the web work. For example, the EFF's Cover Your Tracks page (which is basically a fingerprinting demonstration) shows that the screen resolution of my monitor conveys 16.25 bits of information. I use a particularly wonky landscape display mounted in portrait mode, which doesn't help matters, but there's a problem: we can't lie about it.
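
To put that number in perspective, here is a small sketch of the arithmetic. It assumes the usual reading of fingerprint "bits" as surprisal, i.e. minus log base 2 of the share of browsers that report the same value. The function names are mine, not anything taken from the Cover Your Tracks page.

```typescript
// Surprisal in bits for a value reported by a given share of browsers.
// share = 0.5 means half of all browsers report this value (1 bit).
function bitsFromShare(share: number): number {
  return -Math.log(share) / Math.LN2;
}

// Inverse direction: how many browsers you would need to sample, on
// average, to see one with the same value.
function oneInN(bits: number): number {
  return Math.pow(2, bits);
}

// The 16.25 bits quoted above for the screen resolution.
const crowdSize = oneInN(16.25);
console.log(Math.round(crowdSize));
```

Under that reading, 16.25 bits works out to roughly one browser in 78,000 sharing the same resolution.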

You see, a while back we decided to allow writing CSS that changed the design of a website based on the size of its containing viewport. This is called "responsive design" and is very useful; however, it also means that websites rely on having a correct window size in order to display content correctly. We cannot be inconsistent about our lies: if we were to, say, lie about the screen resolution but still handle media queries faithfully, then not only can the fingerprinter see through our lie, it can use the fact that we lied as extra information. (Remember how DNT served as an effective tracking indicator?) So that would mean browsers would have to start, say, snapping browser windows to certain common viewports or capping the number of distinct breakpoints a website's CSS is allowed to have, both of which have UX or compatibility implications.

A better long term solution would be to make websites behave more like print has for the past 500 years, and only allow them to pull info from the server while greatly restricting what they can transmit back. Only allow POST buttons to send home strictly textual data the users themselves typed out. Yeah, this would break certain cool legit websites like those that run HTML5 games that need to constantly transmit info back to the server. But this wouldn't have to break sites like HN/Reddit/Twitter/news/science/whatever. If you wanted a web that could give you beautiful images and typography without fingerprinting, such a thing could be built. It would break big Adtech tracking though so we'll never see this simplifying improvement in a big-Adtech funded browser (like Chromium).
>and only allow them to pull info from the server while greatly restricting what they can transmit back

The first forms of user tracking involved 1px GIFs that existed purely so that the server could log the request. If you allow any code execution at all, then the client can send data back to the server by asking for data from the server. Reads are just bidirectional writes.
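
A minimal sketch of why a read is also a write: whatever a script knows can be packed into the URL of an image it requests, and the server learns it from its request log. The host `tracker.example` and the helper `beaconUrl` are hypothetical names used only for illustration.

```typescript
// Build the URL of a "1px GIF" whose query string carries arbitrary data.
// Requesting this URL is a read from the client's point of view, but the
// server receives every key/value pair in the query string.
function beaconUrl(base: string, data: { [key: string]: string }): string {
  const pairs = Object.keys(data).map(
    (key) => encodeURIComponent(key) + "=" + encodeURIComponent(data[key])
  );
  return pairs.length > 0 ? base + "?" + pairs.join("&") : base;
}

// Hypothetical example: leak a screen size through an image request.
const url = beaconUrl("https://tracker.example/p.gif", { w: "2560", h: "1600" });
console.log(url);
```

Nothing here needs a POST or a form; an ordinary image fetch is enough, which is the point of the comment above.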

Those 1px GIFs existed so that some server other than the one you are currently interacting with could track you. So if I go to nytimes.com, I might get served a 1px GIF from BigAdTechCorp.com. The proposal is that all images, text, and data come only from the server you are currently pointing your browser at. So if you go to nytimes.com, then only nytimes.com can send you text and images, and only the nytimes.com server sees what content you request and when you request it. Once upon a time people purchased printed newspapers and magazines, and there weren't all these invasive ways to spy on how long readers engaged with articles and images in said periodicals, and yet we all managed. Marketing firms made ad buys all the same with this old tech, and many successful ad campaigns happened, all without the invasive tracking.
> Only allow POST buttons to send home strictly textual data the users themselves typed out

I’m almost with you but in that case even shopping online would not work. Unless you force the user to manually type in the SKU of each item they want to add to their cart and so on. That’s not gonna happen :p

Good point. There would be a few details like this to work out, of course. My first suggestion would be to allow text-boxes to be pre-populated with form data (like QTY and SKU) subject to tight restrictions. (There would be no script code sniffing the user's fingerprint before the text boxes are populated with SKU data, for example.) So a web dev could still create a shopping page with Add-to-Cart buttons such that clicking the button tells the server the SKU and QTY added to cart for session SESSION_ID. And the user transparently observes exactly which sequence of Unicode codepoints is being transmitted to the server upon clicking the Add-to-Cart button.
Yep, despite good browser hygiene my most unique browser identifier is my display size and color depth (2560x1600x24). Guess my original Apple Cinema HD display will clearly identify me as a dumb cheapskate who pays for expensive things and then uses them until they break. I'm a terrible advertising target :)
You can lie about screen size. If you don't have your browser fullscreen, then the browser can claim the screen size is the browser size. Tor Browser handles this by having certain fixed sizes meant for different-sized displays, so at worst you fall into one of a few categories.
If you're claiming the screen size is the browser size, you've provided more fingerprintable information. Browser viewport sizes are more numerous than screen sizes.

I'm assuming Tor Browser doesn't actually alter the CSS stylesheet layout engine, so here's a quick way around that: Construct a CSS stylesheet that styles a div a certain way based on the media query size. Encode the success or failure of the media queries in the width of the element and then read out the width of the element to tell if the browser is lying to you. Even if the browser throttles CSS queries, you'll still know that a lie happened, which is an extra fingerprinting bit.
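
Here is a rough sketch of that consistency check. Note that it swaps the styled-div readout described above for a plain boolean probe, so it can run outside a browser; the names `probeWidth` and `reportDisagrees` are hypothetical. It assumes the probe is monotone: true for every width up to the real one and false above it.

```typescript
// A probe answers "would a (min-width: px) media query match?".
// In a browser it could be backed by:
//   (px) => window.matchMedia("(min-width: " + px + "px)").matches
type MinWidthProbe = (px: number) => boolean;

// Binary search for the largest width the probe still accepts.
// maxPx is an arbitrary upper bound chosen for this sketch.
function probeWidth(matchesMinWidth: MinWidthProbe, maxPx: number = 16384): number {
  let lo = 0;
  let hi = maxPx;
  while (lo < hi) {
    const mid = Math.ceil((lo + hi) / 2); // always > lo, so the loop shrinks
    if (matchesMinWidth(mid)) {
      lo = mid; // query matched: real width is at least mid
    } else {
      hi = mid - 1; // query failed: real width is below mid
    }
  }
  return lo;
}

// True when the width the browser reports through script (for example
// window.innerWidth) disagrees with what the media queries reveal.
function reportDisagrees(reportedPx: number, matchesMinWidth: MinWidthProbe): boolean {
  return probeWidth(matchesMinWidth) !== reportedPx;
}
```

With a 16384px bound this takes about 14 probes, so throttling media queries slows the check down but does not stop it.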

If you're thinking of prohibiting dynamic CSS, you'll break all sorts of harmless JS - and still not fix the problem. You could maintain two sets of computed layouts, of course, but that would break JavaScript layout scripts (e.g. masonry.js). The option that breaks no scripts is to lock browser widths to rendering at specific viewports - if Tor Browser does that, then it probably has adequately resisted this particular fingerprinting vector, at the expense of some user experience.

Just tested on the latest Tor Browser. It limits the viewport sizes to a small set (multiples of 200px, I think). The JS `screen.width` etc. are undefined. The media queries can only query the viewport size, not the screen size.
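
A sketch of what that kind of snapping amounts to, assuming the 200px step guessed at above (the real step sizes may differ, and `snapDown` is a hypothetical name, not a Tor Browser function):

```typescript
// Round a viewport dimension down to the nearest multiple of `step`,
// so that many different real window sizes report the same value.
function snapDown(px: number, step: number = 200): number {
  return Math.floor(px / step) * step;
}

// Count how many distinct values survive snapping for a range of widths.
function distinctBuckets(minPx: number, maxPx: number, step: number = 200): number {
  const seen: { [bucket: number]: boolean } = {};
  let count = 0;
  for (let px = minPx; px <= maxPx; px++) {
    const bucket = snapDown(px, step);
    if (!seen[bucket]) {
      seen[bucket] = true;
      count++;
    }
  }
  return count;
}

console.log(snapDown(1337));
console.log(distinctBuckets(800, 2000));
```

The layout and the media queries both see the snapped size, so there is no inconsistency left to detect; the cost is the unused margin around the page.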
> We cannot be inconsistent about our lies: if we were to, say, lie about the screen resolution but still handle media queries faithfully, then not only can the fingerprinter see through our lie, it can use the fact that we lied as extra information.

Depends on to whom we lie. The OS usually[1] should not lie to the browser and the browser has to know the truth to do its duty and for example render the page.

The browser lying to the server is a different story. Not a simple story, but in principle there is no reason the server needs to learn rendering details from the client.

[1] Location-faking apps are one example where the OS lying works with few downsides.

If the browser doesn't lie while rendering the page, then JavaScript can deduce the window size by measuring elements on the page. And obviously JavaScript can send this information to the server.