

by float4 1803 days ago

The only thing I like about PDF compared to HTML is that with PDF, I know for a fact that no web requests are made in the background. That means no fingerprinting, no analytics etc.

With HTML, I have to trust that some random entity does what they state in their privacy policy, and they regularly don't. Sure, I can disable JS, but then 95% of the web doesn't work anymore.

Other than that, PDF is quite clearly a less accessible format.

6 comments

robin_reala 1803 days ago

How do you know for a fact? PDF has JS in the spec, and it supports SOAP and Web Services. Have a look at https://www.adobe.com/go/acrobatsdk_jsdevguide

float4 1803 days ago

That's not the PDF spec, is it? That is a spec for Adobe Acrobat, which is not allowed to make any web requests thanks to my application firewall (Little Snitch).

Pretty sure a PDF opened in the browser can't run any JS, but not completely sure. So you're right: I don't really know it for a fact. Poor choice of words.

robin_reala 1803 days ago

The spec is ISO 32000, and it’s expensive and closed, so difficult to reference. But according to Wikipedia at least, JavaScript is normative in it. No idea if SOAP / Web Services is part of it though.

jl6 1803 days ago

The spec for PDF 1.7 is here: https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PD...

JavaScript is allowed, but not in PDF/A, which is what I use.

The PDF 2.0 spec is damnably not public.

the8472 1803 days ago

But you can't easily tell PDF/A and regular PDF apart, so we're back to the same situation as HTML vs. HTML with JavaScript turned off.
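A rough way to see which kind of file you have is to scan the raw bytes. The sketch below is only a heuristic under stated assumptions: `inspect_pdf_bytes` and the `SUSPICIOUS_NAMES` list are made up for illustration, the list is not exhaustive, and name tokens hidden inside compressed object streams or written with hex escapes will be missed. The `pdfaid:part` marker is the XMP field a file uses to claim PDF/A conformance; finding it shows a claim, not verified compliance.

```python
import re

# Name tokens that commonly indicate active or network-capable features.
# Illustrative assumption: this list is not exhaustive.
SUSPICIOUS_NAMES = [
    b"/JavaScript", b"/JS", b"/OpenAction", b"/AA",
    b"/URI", b"/SubmitForm", b"/Launch",
]

# XMP metadata declares PDF/A either as an element (<pdfaid:part>1</pdfaid:part>)
# or as an attribute (pdfaid:part="1"); this pattern accepts both forms.
PDFA_PART = re.compile(rb'pdfaid:part\s*(?:=\s*["\']|>)\s*(\d)')


def inspect_pdf_bytes(data: bytes) -> dict:
    """Heuristically report a PDF/A claim and count suspicious name tokens."""
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # Require a non-alphanumeric character (or end of data) after the
        # name so that /AA does not match a longer name such as /AAPL.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode()] = len(re.findall(pattern, data))
    match = PDFA_PART.search(data)
    return {
        "claims_pdfa_part": match.group(1).decode() if match else None,
        "name_counts": counts,
    }


if __name__ == "__main__":
    import sys

    # Usage: python inspect_pdf.py some_file.pdf
    with open(sys.argv[1], "rb") as handle:
        print(inspect_pdf_bytes(handle.read()))
```

A file that reports no PDF/A claim and nonzero counts deserves a closer look; a clean result proves nothing, for the reasons given above.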

grncdr 1803 days ago

Are you sure? I was under the impression that PDFs can reference web resources, and that this is why there are more stringent standards for archiving (PDF/A and friends).

account42 1803 days ago

> With HTML, I have to trust that some random entity does what they state in their privacy policy, and they regularly don't. Sure, I can disable JS, but then 95% of the web doesn't work anymore.

If you only allow PDF, then 99.9999% of the web doesn't work anymore.

I'm all for getting sites to be static, but PDF doesn't fix that because the problem has never been the technology used to build the site.

jefftk 1803 days ago

How sure are you that there are no network requests happening? I tried to look this up and wasn't able to find any clear answer.

(It looks like at least some PDF readers have provided support for automatically displaying external images, for example)

foobar33333 1803 days ago

The full PDF spec is insane and allows for web requests and JavaScript. Most readers do not implement these anti-features, but Adobe's tools do.

deregulateMed 1803 days ago

You are fingerprinted when you find the web link.

float4 1803 days ago

When I click a link, you mean? Definitely true, but that way they only have access to my IP and user agent, which is still better than all the WebGL, font library, display calibration settings, mouse movement etc. that they use otherwise.

I often use Tor, although I'm pretty sure that even then, a good analytics lib can see it's me based on scroll behaviour, mouse movement, time of day, and of course what I browse.

But yeah, you make a good point.

deregulateMed 1803 days ago

Where do you get the link?

float4 1803 days ago

DDG mostly, and they don't track users.

deregulateMed 1803 days ago

Your device, your device version, screen size, browser, browser version, IP address, etc. are all tracked regardless.

You might not be a unique fingerprint, but at best you are part of a group of somewhere between 3 and 1000 similar users.
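To put rough numbers on that group size: the smaller the group you blend into, the more bits of identifying information the fingerprint carries. The sketch below is a back-of-the-envelope calculation only; `identifying_bits` is a name made up here, and the population of one million visitors is a hypothetical figure, not something from this thread.

```python
import math


def identifying_bits(population: int, anonymity_set: int) -> float:
    """Bits of information needed to narrow `population` down to a group
    of `anonymity_set` indistinguishable users."""
    if not 0 < anonymity_set <= population:
        raise ValueError("anonymity_set must be between 1 and population")
    return math.log2(population / anonymity_set)


if __name__ == "__main__":
    population = 1_000_000  # hypothetical number of visitors
    for group in (1000, 3, 1):
        bits = identifying_bits(population, group)
        print(f"group of {group}: {bits:.1f} bits")
```

Under that assumed population, a group of 1000 corresponds to roughly 10 bits and a group of 3 to roughly 18 bits, so the gap between "part of a crowd" and "nearly unique" is only a handful of extra attributes.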

Not to be a downer, but when I did web scraping I learned that big corporations can spend money to fingerprint you.

andrepd 1803 days ago

Why?

andrepd 1803 days ago

You can choose not to use JS on your website.