Hacker News new | ask | show | jobs
by adrianN 3297 days ago
Not if the "iterating over character" function iterates over actual characters and not codepoints.
2 comments

You mean grapheme clusters? Swift is the only language I know that uses that by default, and you still wouldn't want to store strings as a list of grapheme clusters.
I believe Perl 6 does so as well, see e.g. https://perl6advent.wordpress.com/2015/12/07/day-7-unicode-p...
The apple dev documentation has a nice overview of some of the concerns that need to be taken into account for this to work:

https://developer.apple.com/library/content/documentation/Co...

It's probably one of the better approaches - but it's still not clear if it (alone) allows a developer that speaks only English to develop a text indexing or editing system that works well across English, Japanese, Arabic, Hangul and Dutch for example.

Elixir as well.
The problem is “actual characters” are an ill-defined term; that could mean either code points or graphemes. See, e.g., http://unicode.org/faq/char_combmark.html