by drex04 1028 days ago
The Unicode bidi algorithm isn't failing here, it's working as designed. The source code snippet has base direction LTR, but has mixed-direction content because of the Hebrew text. Punctuation marks have weak or neutral directionality, so the period shows up at the end of the RTL run. Since the base direction is LTR, the 'end' of the RTL run means the right side.

If you want to force mixed-direction content to render correctly, you often need to insert bidirectional control characters to mark where a directional run begins and ends. Adding them doesn't make sense in this case, though, because it would mess up the rendering in the actual input example.
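
For anyone who wants to check the character classes mentioned above, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `bidi_classes` and the sample string are made up for illustration; only `unicodedata.bidirectional` is a real library call.

```python
import unicodedata

def bidi_classes(text):
    """Return (character, bidi class) pairs for each code point in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

# Hypothetical sample: a Latin letter, a space, Hebrew alef, then a period.
sample = "a \u05d0."

for ch, cls in bidi_classes(sample):
    # L and R are strong classes; CS (common separator) is a weak class;
    # WS (whitespace) is a neutral class.
    print(f"U+{ord(ch):04X} {cls}")
```

The period reports as `CS`, a weak class, so it has no direction of its own and is resolved from its surroundings and the base direction.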

2 comments

I would argue that it’s failing and working as designed.

Perhaps a syntax highlighter could learn to insert “first strong isolate” and “pop directional isolate” and to also enforce that content leaves the stack alone?
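
A rough sketch of what that could look like, assuming the highlighter already has each string literal or comment as a separate token. The names `isolate` and `leaves_stack_alone` are hypothetical, and the balance check is deliberately stricter than what the bidi algorithm itself tolerates: it rejects any unmatched or mis-nested control character.

```python
FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

ISOLATE_OPENERS = {"\u2066", "\u2067", "\u2068"}            # LRI, RLI, FSI
EMBEDDING_OPENERS = {"\u202a", "\u202b", "\u202d", "\u202e"}  # LRE, RLE, LRO, RLO

def leaves_stack_alone(text):
    """True if every opener in text is closed in text, in nesting order,
    and no closer appears without a matching opener."""
    stack = []
    for ch in text:
        if ch in ISOLATE_OPENERS:
            stack.append(PDI)   # remember which closer we expect
        elif ch in EMBEDDING_OPENERS:
            stack.append(PDF)
        elif ch in (PDI, PDF):
            if not stack or stack.pop() != ch:
                return False
    return not stack

def isolate(text):
    """Wrap a token in FSI ... PDI, refusing content that is unbalanced."""
    if not leaves_stack_alone(text):
        raise ValueError("token contains unbalanced bidi control characters")
    return FSI + text + PDI

# Hypothetical token: a Hebrew word followed by a period.
wrapped = isolate("\u05e9\u05dc\u05d5\u05dd.")
```

The wrapping would only be applied to the displayed text, not written back to the source file, so the input itself stays untouched.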

You are correct. I explain this in depth here:

https://dotancohen.com/howto/rtl_right_to_left.html