Hacker News new | ask | show | jobs
by hsivonen 2468 days ago
> They are sequences of _scalar values_ (all scalar values are code points but surrogate code points are not scalar values). Exposing the scalar value length and exposing indexability by scalar value index is the same as "(guaranteed-valid) UTF-32 semantics".

Sorry. I'm shocked that I tested wrong when researching the article. Python 3 indeed has code point semantics and not scalar value semantics. I've added a note to the article that I've edited in corrections accordingly.

Python 3 is even more messed up than I thought!