by hsivonen 2249 days ago
What’s the current committee thinking on providing locale-independent conversions from potentially-invalid UTF-8 to valid UTF-8, from potentially-invalid UTF-8 to valid UTF-16, and from potentially-invalid UTF-16 to valid UTF-8 (i.e. replacing ill-formed sequences with the REPLACEMENT CHARACTER)?
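For illustration only, here is a rough sketch of what those three conversions do, written in Python because its built-in codecs already offer a `replace` error handler. The function names are made up for this example and are not a proposed API. How many U+FFFD characters one ill-formed sequence turns into is a policy choice that can differ between implementations.

```python
# Hypothetical helper names; only bytes.decode / str.encode are real APIs here.

def utf8_to_valid_utf8(data: bytes) -> bytes:
    """Potentially-invalid UTF-8 in, valid UTF-8 out."""
    # errors="replace" substitutes U+FFFD for ill-formed input.
    return data.decode("utf-8", errors="replace").encode("utf-8")


def utf8_to_valid_utf16(data: bytes) -> bytes:
    """Potentially-invalid UTF-8 in, valid little-endian UTF-16 out (no BOM)."""
    return data.decode("utf-8", errors="replace").encode("utf-16-le")


def utf16_to_valid_utf8(data: bytes) -> bytes:
    """Potentially-invalid little-endian UTF-16 in, valid UTF-8 out."""
    # Unpaired surrogates are ill-formed in UTF-16 and get replaced.
    return data.decode("utf-16-le", errors="replace").encode("utf-8")


if __name__ == "__main__":
    # 0xFF can never appear in well-formed UTF-8.
    print(utf8_to_valid_utf8(b"a\xffb"))
    # A lone low surrogate (U+DC00) followed by "a".
    print(utf16_to_valid_utf8(b"\x00\xdc" + "a".encode("utf-16-le")))
```

None of this depends on the process locale, which is the property being asked about.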
1 comments

If you changed UTF-16 to UTF-32 or UCS-4 I'd support it. I think there are already implementations that use the replacement character for all "impossible" codes.
What’s your use case for UTF-32?
There are several multibyte character manipulations that are easier if there is a uniform-sized encoding (wchar_t).
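A small sketch of the kind of manipulation meant here, assuming "uniform-sized" means one 32-bit unit per code point (wchar_t is only 16 bits on some platforms, so that is an assumption). The helper names are hypothetical. Fetching the n-th code point from UTF-32 is a direct offset computation, while UTF-8 needs a scan from the start:

```python
def nth_code_point_utf32(buf: bytes, n: int) -> int:
    """Constant-time lookup: every code point occupies exactly 4 bytes."""
    if n < 0 or 4 * n + 4 > len(buf):
        raise IndexError(n)
    return int.from_bytes(buf[4 * n:4 * n + 4], "little")


def nth_code_point_utf8(buf: bytes, n: int) -> int:
    """Linear-time lookup: walk the buffer counting lead bytes.

    Assumes buf is already well-formed UTF-8.
    """
    index = -1
    for start, byte in enumerate(buf):
        if (byte & 0xC0) != 0x80:  # not a continuation byte, so a new code point
            index += 1
            if index == n:
                end = start + 1
                while end < len(buf) and (buf[end] & 0xC0) == 0x80:
                    end += 1
                return ord(buf[start:end].decode("utf-8"))
    raise IndexError(n)


if __name__ == "__main__":
    text = "a\u00e9\u20ac\U0001F600"  # 1-, 2-, 3- and 4-byte sequences in UTF-8
    print(hex(nth_code_point_utf32(text.encode("utf-32-le"), 3)))
    print(hex(nth_code_point_utf8(text.encode("utf-8"), 3)))
```

Both calls index code points, which is the only thing the fixed width buys you.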