by xyzxyz998
3275 days ago
That was informative, thank you. I think a lot of people in the dev/power user community wouldn't mind paying $1 for a Kindle ebook where you note all your findings. There have been so many instances where I wanted to do stuff with PDFs but ended up deflated.
> subset fonts
So you mean if a font has been embedded with three glyphs, 0x41=A, 0x61=a, 0x62=b, then the string Aba would be \1\3\2?
You can get a good overview of the state of the fonts in your PDF using:
There's a column which tells you if there's a Unicode map available for the font. That's important. Because PDF is just rendering glyphs at positions, it doesn't even know what the character names are. To allow you to copy and paste, most fonts in most PDFs will have a Unicode map from the glyph ID to the Unicode symbol. If that's not available, in some cases you can rebuild it yourself by looking at the character encodings and substitutions.
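To make the glyph ID versus Unicode map point concrete, here's a rough toy sketch in Python. It doesn't parse a real PDF. The function names (`build_subset`, `encode`, `extract_text`) are made up for illustration, and the numbering scheme (IDs 1..n in code point order) is an assumption; real subsetters may number glyphs differently.

```python
def build_subset(text):
    """Toy subset font: give each distinct character a glyph ID 1..n.

    Assumption: IDs are assigned in code point order. Real tools may differ.
    Returns (char -> glyph ID, glyph ID -> char); the second dict plays the
    role of the Unicode map.
    """
    chars = sorted(set(text))
    glyph_ids = {ch: gid for gid, ch in enumerate(chars, start=1)}
    to_unicode = {gid: ch for ch, gid in glyph_ids.items()}
    return glyph_ids, to_unicode


def encode(text, glyph_ids):
    """What the page content effectively stores: a list of glyph IDs."""
    return [glyph_ids[ch] for ch in text]


def extract_text(codes, to_unicode=None):
    """Simulate copy and paste.

    With a Unicode map we get characters back; without one, all we
    have are the raw glyph IDs.
    """
    if to_unicode is None:
        return "".join(f"\\{code}" for code in codes)
    return "".join(to_unicode.get(code, "\ufffd") for code in codes)


glyph_ids, to_unicode = build_subset("Aba")
codes = encode("Aba", glyph_ids)

print(codes)                            # [1, 3, 2]
print(extract_text(codes, to_unicode))  # Aba
print(extract_text(codes))              # \1\3\2
```

In this toy model the string Aba does come out as \1\3\2, and the last line shows what you're left with when the map is missing.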
On the book, do you have any examples? I'll probably never get around to writing anything down, but if it looks easy enough it's probably worth having a stab at.
Also, large caveat, I'm not a PDF or font expert. I've probably butchered the terminology here, but hopefully it gives you a rough idea.