| HN Mirror



	by Aransentin 786 days ago
	There are some hard-to-handle edge cases when doing display length truncation in Unicode, e.g. the character U+FDFD or "﷽" is only three bytes in UTF-8 but can be very long depending on the typeface, so "completely" solving it is quite hard and has to depend on feedback from your rasterization engine. (Rendered version on Wikipedia: https://commons.wikimedia.org/wiki/File:Lateef_unicode_U%2BF... )

1 comment

account42 786 days ago

This is a completely unrelated problem since the article is quite clearly about limiting to a certain maximum byte length and not display length. For display length you don't even need Unicode for that to depend on the font and shaping engine.
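
To make the byte-length case concrete, here is a minimal Python sketch of truncating a string to a maximum UTF-8 byte length without cutting a code point in half. The function name `truncate_utf8` and its parameters are made up for illustration and are not taken from the article. It says nothing about display width, and it can still split a grapheme cluster (e.g. a base letter from its combining mark).

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Return the longest prefix of `text` whose UTF-8 encoding fits in `max_bytes`.

    Hypothetical helper for illustration: it works on code points only,
    so it knows nothing about grapheme clusters or rendered width.
    """
    if max_bytes < 0:
        raise ValueError("max_bytes must be non-negative")
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # Cut at the byte limit; the cut may land in the middle of a multi-byte
    # sequence, so decode with errors="ignore" to drop the incomplete tail.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


if __name__ == "__main__":
    sample = "a\ufdfd"  # 'a' (1 byte) followed by U+FDFD (3 bytes in UTF-8)
    print(len(sample.encode("utf-8")))
    print(repr(truncate_utf8(sample, 3)))
```

With a limit of 3 bytes, U+FDFD no longer fits after the "a", so the whole character is dropped instead of leaving a broken byte sequence behind.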
