Hacker News
by 0xfffafaCrash 320 days ago
Not just UTF-16, but potentially malformed UTF-16: JS strings allow invalid surrogate pairs and lone surrogate halves, and JS string functions compute things like length in 16-bit code units rather than in characters.

This is also widely known as WTF-16 (seriously, look it up!)
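
To make that concrete, here is a rough sketch in plain JS. The helper name `isWellFormedUtf16` is made up for illustration and is not a built-in; it just walks the code units and checks that every surrogate is part of a proper high+low pair.

```javascript
// A lone high surrogate: ill-formed as UTF-16, but a perfectly legal JS string.
const lone = "\uD83D";
// U+1F600 is a single code point that takes two UTF-16 code units (D83D DE00).
const emoji = "\u{1F600}";

// Hypothetical helper (not a built-in): returns true only if every
// surrogate code unit belongs to a valid high+low pair.
function isWellFormedUtf16(s) {
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      // High surrogate: must be followed immediately by a low surrogate.
      const next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
      if (next < 0xdc00 || next > 0xdfff) return false;
      i++; // skip the low half of the pair
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      // Low surrogate with no preceding high surrogate.
      return false;
    }
  }
  return true;
}

console.log(emoji.length);                         // counts code units
console.log([...emoji].length);                    // iterates by code point
console.log(isWellFormedUtf16(emoji.slice(0, 1))); // slicing can split a pair
```

Note that `slice` happily cuts a pair in half, which is one everyday way ill-formed strings get created without anyone asking for them.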

1 comment

Most UTF-8-native stuff enforces well-formedness.

I have never in my entire life knowingly encountered a UTF-16-native system that enforced well-formedness.

In practice, UTF-16 means potentially-ill-formed UTF-16.
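
A quick way to see the asymmetry, sketched in JS and assuming a runtime that exposes the standard `TextEncoder`/`TextDecoder` globals (browsers, recent Node.js). The names `strictUtf8` and `decodesAsUtf8` are just for illustration.

```javascript
// The string side accepts a lone surrogate without complaint.
const illFormed = "\uD83D";

// Crossing into UTF-8, the encoder does not pass it through:
// the lone surrogate is replaced with U+FFFD (bytes EF BF BD).
const encoded = new TextEncoder().encode(illFormed);

// A strict decoder rejects bytes that spell out a surrogate directly.
// ED A0 BD is what U+D83D would look like if it were encoded as-is.
const strictUtf8 = new TextDecoder("utf-8", { fatal: true });

// Hypothetical helper: true if the bytes decode as well-formed UTF-8.
function decodesAsUtf8(bytes) {
  try {
    strictUtf8.decode(bytes);
    return true;
  } catch {
    return false;
  }
}

console.log(Array.from(encoded));                              // replacement character bytes
console.log(decodesAsUtf8(Uint8Array.of(0xed, 0xa0, 0xbd)));   // encoded surrogate
console.log(decodesAsUtf8(Uint8Array.of(0xf0, 0x9f, 0x98, 0x80))); // U+1F600
```

So the UTF-8 boundary is where the checking happens; nothing on the UTF-16 string side ever objects.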