


	by euske 1736 days ago
	This is the reason why Adobe PDF doesn't rely on Unicode. Adobe products have had a huge presence in Japan since the 90s, and they had to appeal to the printing industry, which is very particular about this kind of issue. So they ended up using a separate encoding for every language. Today, CJK characters in PDF are encoded in Adobe-GB1 (Simplified Chinese, mainland China), Adobe-CNS1 (Traditional Chinese, Taiwan and Hong Kong), Adobe-Japan1 and Adobe-Korea1. Not the cleanest way, but it gets the job done.

3 comments

makeitdouble 1736 days ago

Thanks for the pointer, that's pretty interesting.

Looking at their doc [0], it seems they used Adobe-Japan1 to cover a much wider set of characters than any single encoding standard, including ligatures, vintage encodings, etc.

It seems to be a pretty big piece of work, and it kind of fits with the image of PDF handling being such a monumental beast.

[0] https://github.com/adobe-type-tools/Adobe-Japan1/


lifthrasiir 1736 days ago

Note that they have since been adopted into the Unicode Ideographic Variation Database [1], among other variation databases.

[1] https://unicode.org/ivd/


ksec 1736 days ago

Adobe gets a lot of stick for its subscriptions and its malware-like Creative Cloud, but they do spend a huge amount of resources on CJK fonts, layout and encoding.

That's part of the reason why I like PDF.

(Behind a paywall) https://ken-lunde.medium.com/my-28-years-of-adobelife-e97e70...
