Hacker News
by GuB-42 408 days ago
PDF is effectively digital paper, and it works really well for this. When I made PDFs 20 years ago, I knew they would always look the same on every device, including on paper, and they did, and they still do. In addition, a document is a single file, reasonably compact, looks good at any resolution, and is generally searchable. Even if not ideal, it can also hold scans of paper documents, which can be sent to a printer on the other side of the planet and give the same result as if you had used a copier.

Data extraction is hard, but that's not what it is designed for; it is for people to read, like paper documents.

Far from being "mad", it is remarkably stable. It has some crazy features, and it is not designed for data extraction (but doesn't actively prevent it!). But look at the alternatives. Word documents? HTML? SVG? One of the zillion XML-based document formats? Markdown? Is any one of these suitable for writing, say, a scientific paper (with maths, tables, graphics...) in a way that is readable by a human on a computer or in print, will still be readable decades from now, and is easier for a machine to process than a PDF?