by zdw 1535 days ago
HTML is somewhat human readable in a text editor, but PDF likely is not.
1 comment

Actually, a lot of the PDF format is plain text, though it can contain binary streams. You can open a PDF in a text editor and see the header (a line like %PDF-1.4), then skip to the end and see the xref table, the trailer, and some other parts. The binary sections are enclosed in the plain-text keywords stream and endstream, but you probably won't be able to read much of the actual content this way, since it is usually compressed or encrypted.
Sometimes even the binary fragments are compressed plain text.
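
If you want to see this for yourself, here is a rough Python sketch. It assumes the streams use the common Flate (zlib) filter and are not encrypted, and it finds them by scanning for the stream/endstream keywords rather than honoring the /Length entry, so treat it as a peek tool, not a parser. The function name peek_pdf and the tiny in-memory sample are made up for illustration.

```python
import re
import zlib


def peek_pdf(data: bytes):
    """Return (header_line, decoded_streams) for raw PDF bytes.

    Hypothetical helper: a keyword scan, not a real PDF parser.
    Streams that are not zlib data come back as None.
    """
    # The first line of a PDF is a plain-text header such as %PDF-1.4.
    header = data.split(b"\n", 1)[0].decode("latin-1").strip()

    streams = []
    # Stream data sits between the plain-text keywords "stream" and "endstream".
    for match in re.finditer(rb"stream\r?\n(.*?)endstream", data, re.DOTALL):
        raw = match.group(1)
        try:
            # decompressobj tolerates the trailing newline before "endstream".
            streams.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            # Uncompressed, differently filtered, or encrypted content.
            streams.append(None)
    return header, streams


# A tiny PDF-like fragment built in memory (not a complete, valid PDF).
content = b"BT /F1 12 Tf (Hello) Tj ET"
packed = zlib.compress(content)
sample = (
    b"%PDF-1.4\n"
    b"1 0 obj\n"
    b"<< /Length " + str(len(packed)).encode() + b" /Filter /FlateDecode >>\n"
    b"stream\n" + packed + b"\nendstream\n"
    b"endobj\n"
    b"%%EOF\n"
)

header, streams = peek_pdf(sample)
print(header)
print(streams)
```

On a real file you would pass in open(path, "rb").read() instead of the sample. Page content streams that decode cleanly usually turn out to be text operators like the one above.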