by dmsnell
366 days ago
Unicode has a block of Tag Characters (U+E0000 to U+E007F), created for marking regions of text as being in a particular language. They were deprecated for that purpose in favor of higher-level markup (such as HTML tags), but the characters still exist. They are special because they are invisible, and a sequence of them behaves as a single character for cursor movement. They mirror ASCII, so you can encode arbitrary JSON or other data in them. That makes them quite suitable for marking LLM-generated spans, as long as you don’t mind annoying people with hidden data or deprecated usage. https://en.m.wikipedia.org/wiki/Tags_(Unicode_block)
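To make the "mirror ASCII" point concrete, here is a rough sketch of how hidden data could be packed into tag characters. It assumes the mapping is simply the ASCII code point plus 0xE0000 for the printable range U+0020 to U+007E; the names `to_tags`, `from_tags`, and `TAG_OFFSET` are made up for illustration and are not from any library.

```python
import json

# Assumption: each printable ASCII character has a tag twin at U+E0000 + code point.
TAG_OFFSET = 0xE0000


def to_tags(text: str) -> str:
    """Map printable ASCII (U+0020..U+007E) onto invisible tag characters."""
    out = []
    for ch in text:
        cp = ord(ch)
        if not 0x20 <= cp <= 0x7E:
            raise ValueError(f"not printable ASCII: {ch!r}")
        out.append(chr(cp + TAG_OFFSET))
    return "".join(out)


def from_tags(text: str) -> str:
    """Collect the tag characters found in text and map them back to ASCII."""
    return "".join(
        chr(ord(ch) - TAG_OFFSET)
        for ch in text
        if 0xE0020 <= ord(ch) <= 0xE007E
    )


# json.dumps escapes non-ASCII by default, so the payload is plain printable ASCII.
payload = json.dumps({"source": "llm"})
marked = "Hello, world." + to_tags(payload)

hidden = from_tags(marked)
```

Here `marked` should render the same as the visible text in most environments, while `from_tags(marked)` gives back the JSON string. Whether the tag characters survive copy/paste, normalization, or sanitizing filters depends on the software involved, so treat this as a demonstration rather than a robust watermark.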