by dcode
1303 days ago
I do understand the desire to switch all languages and systems to one encoding, of course. However, switching a WTF-16 language to UTF-8 removes previously valid values from strings, which merely exchanges today's errors/mutation on Component boundaries for errors/mutation when using string APIs. It can't be done in a backwards-compatible way, and all these languages have a lot of existing code. If backwards compatibility is a goal (say, when using a breadcrumbs mechanism as in Swift), one still ends up with WTF-8 underneath, which maps to WTF-16 but is not UTF-8. That is why I think it's impossible: the only way to pull this off is by replacing the affected string APIs (and/or accepting that the old APIs then throw or mutate). Likewise, I see a possible future where JS adopts breadcrumbs, but then with WTF-8 (and perhaps a well-formedness flag), not guaranteed UTF-8. In your use case, that would yield a fast path if a string is well-formed, but still with the same old fallback. Plus, of course, having a systems fast path implies that there is a corresponding JS-interop slow path (when using AS).
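To make the WTF-8 versus UTF-8 distinction concrete, here is a minimal sketch. The helper `encodeWtf8` is hypothetical (it is not an API of JS, AssemblyScript, or WASI). It encodes a JS string as WTF-8 and reports a well-formedness flag, roughly in the spirit of the "WTF-8 plus flag" idea above: when the flag is true, the bytes are also valid UTF-8; when it is false, they are not.

```typescript
// Hypothetical helper (an illustration, not a real library API): encode a
// JS (WTF-16) string as WTF-8 bytes and report whether the input was
// well-formed, i.e. whether the resulting bytes are also valid UTF-8.
function encodeWtf8(s: string): { bytes: number[]; wellFormed: boolean } {
  const bytes: number[] = [];
  let wellFormed = true;
  for (let i = 0; i < s.length; i++) {
    let cp = s.charCodeAt(i);
    if (cp >= 0xd800 && cp <= 0xdbff) {
      const next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
      if (next >= 0xdc00 && next <= 0xdfff) {
        // Proper surrogate pair: combine into one supplementary code point.
        cp = 0x10000 + ((cp - 0xd800) << 10) + (next - 0xdc00);
        i++;
      } else {
        wellFormed = false; // lone high surrogate
      }
    } else if (cp >= 0xdc00 && cp <= 0xdfff) {
      wellFormed = false; // lone low surrogate
    }
    // Generalized UTF-8 encoding; lone surrogates take the 3-byte branch,
    // which strict UTF-8 forbids.
    if (cp < 0x80) {
      bytes.push(cp);
    } else if (cp < 0x800) {
      bytes.push(0xc0 | (cp >> 6), 0x80 | (cp & 0x3f));
    } else if (cp < 0x10000) {
      bytes.push(0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f));
    } else {
      bytes.push(
        0xf0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3f),
        0x80 | ((cp >> 6) & 0x3f),
        0x80 | (cp & 0x3f),
      );
    }
  }
  return { bytes, wellFormed };
}

// Well-formed input: the bytes are ordinary UTF-8 (fast path).
const paired = encodeWtf8("a\u{1F600}"); // bytes 61 F0 9F 98 80

// A lone high surrogate is a valid JS string value, but its WTF-8 bytes
// (61 ED A0 BD) are not valid UTF-8 (fallback path).
const lone = encodeWtf8("a\uD83D");
```

A strict UTF-8 boundary has to either reject the second string or replace the surrogate (typically with U+FFFD), which is the throw-or-mutate trade-off described above.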
I expect other languages to make the switch sooner or later.
I do think, though, that this is not all that interesting for the issue here. WASI needs to pick some format, and picking UTF-8 is fine. Round-tripping half-broken UTF-16 is not something that needs preserving.
I think enforcing UTF-8 there won’t be much of an issue in practice.