| HN Mirror



	by hsivonen 2297 days ago
	What’s the current committee thinking on providing locale-independent conversions from potentially-invalid UTF-8 to valid UTF-8, from potentially-invalid UTF-8 to valid UTF-16, and from potentially-invalid UTF-16 to valid UTF-8 (i.e. replacing ill-formed sequences with the REPLACEMENT CHARACTER)?
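
For readers unfamiliar with the operation being asked about, here is a minimal sketch of the first conversion (potentially-invalid UTF-8 to valid UTF-8). It is not a proposed or existing standard C interface: the name `utf8_sanitize` and its signature are hypothetical. It assumes the common "maximal subpart" practice, in which each ill-formed sequence is replaced by one U+FFFD (bytes EF BF BD), and it assumes the caller supplies an output buffer of at least 3 * len bytes.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the C standard library.
 * Copies src[0..len) to dst, replacing each ill-formed UTF-8 sequence
 * (maximal subpart) with U+FFFD. dst must have room for 3 * len bytes.
 * Returns the number of bytes written. */
size_t utf8_sanitize(const unsigned char *src, size_t len, unsigned char *dst)
{
    size_t i = 0, o = 0;

    while (i < len) {
        unsigned char b = src[i];
        unsigned char lo = 0x80, hi = 0xBF; /* allowed range of next byte */
        size_t need;                        /* continuation bytes expected */

        if (b < 0x80) {                     /* ASCII: copy through */
            dst[o++] = b;
            i++;
            continue;
        } else if (b >= 0xC2 && b <= 0xDF) {
            need = 1;
        } else if (b >= 0xE0 && b <= 0xEF) {
            need = 2;
            if (b == 0xE0) lo = 0xA0;       /* reject overlong forms */
            if (b == 0xED) hi = 0x9F;       /* reject surrogates */
        } else if (b >= 0xF0 && b <= 0xF4) {
            need = 3;
            if (b == 0xF0) lo = 0x90;       /* reject overlong forms */
            if (b == 0xF4) hi = 0x8F;       /* reject values above U+10FFFF */
        } else {
            need = 0;                       /* byte can never start a sequence */
            lo = 1; hi = 0;                 /* empty range: accept nothing */
        }

        /* j counts the lead byte plus the continuation bytes accepted so far. */
        size_t j = 1;
        while (j <= need && i + j < len && src[i + j] >= lo && src[i + j] <= hi) {
            j++;
            lo = 0x80;                      /* only the first continuation */
            hi = 0xBF;                      /* byte has a restricted range */
        }

        if (need > 0 && j == need + 1) {    /* well-formed: copy as-is */
            for (size_t k = 0; k < j; k++)
                dst[o++] = src[i + k];
        } else {                            /* ill-formed: emit U+FFFD once */
            dst[o++] = 0xEF;
            dst[o++] = 0xBF;
            dst[o++] = 0xBD;
        }
        i += j;
    }
    return o;
}
```

With this sketch, the three input bytes 61 FF 62 become the five bytes 61 EF BF BD 62, and nothing in the function consults the current locale.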

1 comments

DougGwyn 2297 days ago

If you changed UTF-16 to UTF-32 or UCS-4 I'd support it. I think there are already implementations that use the replacement character for all "impossible" codes.


hsivonen 2296 days ago

What’s your use case for UTF-32?


DougGwyn 2294 days ago

There are several multibyte character manipulations that are easier if there is a uniform-sized encoding (wchar_t).
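
As one illustration of the kind of manipulation meant here, consider locating the n-th code point of a string. The sketch below is only an assumption about what such an operation might look like; both function names are hypothetical, `uint32_t` stands in for a 32-bit-per-code-point type (wchar_t is not guaranteed to be 32 bits wide on every platform), and the UTF-8 version assumes its input is already well-formed.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed-width encoding: the n-th code point is a plain array index.
 * Assumes n is within bounds. */
uint32_t utf32_at(const uint32_t *s, size_t n)
{
    return s[n];
}

/* Variable-width encoding: the string must be scanned from the start.
 * Returns the byte offset of the n-th code point (0-based), or len if
 * the string has fewer than n + 1 code points.
 * Assumes s[0..len) is well-formed UTF-8. */
size_t utf8_offset_of(const unsigned char *s, size_t len, size_t n)
{
    for (size_t i = 0; i < len; i++) {
        /* Every byte that is not a continuation byte (10xxxxxx)
         * starts a new code point. */
        if ((s[i] & 0xC0) != 0x80) {
            if (n == 0)
                return i;
            n--;
        }
    }
    return len;
}
```

The first function runs in constant time, while the second is linear in the length of the string, which is the trade-off against the extra memory that a 32-bit representation uses.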
