by dfox 2392 days ago
The main point that should be emphasised is that any encoding with fixed-size Unicode code points is mostly unnecessary, because you mostly don’t care about the code points themselves but about what the resulting glyphs, or even glyph runs, look like.
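To make that concrete, here is a small Python sketch (my own illustration, using only the standard library) of one visible character that is either one or two code points depending on how it was typed. A fixed-size encoding such as UTF-32 gives you constant-time indexing by code point, but that index still does not correspond to what the user sees.

```python
import unicodedata

# The same visible character "é", written two ways.
composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE, one code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT, two code points

# Python's len() counts code points, not glyphs.
composed_len = len(composed)
decomposed_len = len(decomposed)

# UTF-32 spends 4 bytes per code point, so the byte lengths differ too,
# even though both strings render as a single glyph.
composed_utf32 = len(composed.encode("utf-32-le"))
decomposed_utf32 = len(decomposed.encode("utf-32-le"))

# Normalisation (NFC) maps the decomposed form onto the composed one.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed

print(composed_len, decomposed_len, composed_utf32, decomposed_utf32, same_after_nfc)
```

Normalisation only helps in cases like this one where a precomposed form exists; many combining sequences have none, so an editor still has to reason about clusters rather than individual code points.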

My experience is that if you want to implement an efficient Unicode-aware text editor, the right data structure is a list of lines, and you should simply forget about gap buffers, ropes and whatnot (unless you really care about 32k+ lines/paragraphs, which is when a rope-style representation starts to make sense, as long as the breaks match Unicode semantics).
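A minimal sketch of the list-of-lines idea in Python, under some loud assumptions: the class name `LineBuffer` and its methods are hypothetical, columns are plain code point offsets, and nothing here does grapheme-cluster or shaping work. A real editor would have to align cursor positions and edits with Unicode semantics.

```python
class LineBuffer:
    """Hypothetical text buffer: one Python string per line."""

    def __init__(self, text=""):
        # Each line is stored separately, without its trailing newline.
        self.lines = text.split("\n")

    def insert(self, row, col, s):
        """Insert s at (row, col); s may itself contain newlines."""
        line = self.lines[row]
        # Only the edited line is rebuilt; other lines are untouched.
        pieces = (line[:col] + s + line[col:]).split("\n")
        self.lines[row:row + 1] = pieces

    def delete(self, row, col, count):
        """Delete count code points within a single line."""
        line = self.lines[row]
        self.lines[row] = line[:col] + line[col + count:]

    def join_with_next(self, row):
        """Remove the line break between row and row + 1."""
        self.lines[row:row + 2] = [self.lines[row] + self.lines[row + 1]]

    def text(self):
        return "\n".join(self.lines)


buf = LineBuffer("hello\nworld")
buf.insert(0, 5, ", there\nnew")
print(buf.lines)
```

The cost of an edit is proportional to the length of the line being edited, which is why this works well until individual lines become very long.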