Hacker News
by brainsmith 2302 days ago
Because WebAssembly has no native string type, different Wasm compilers use different memory layouts and string encodings. For example, AssemblyScript uses UCS-2 for the sake of compatibility with JavaScript. This means the host has to be careful with memory bounds and with string length estimation, because the host's native string encoding and the guest's can differ.
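
To illustrate why length estimation matters, here is a minimal Python sketch. It is not code from wasmer-as; the `bytearray` only stands in for a guest's linear memory, the helper names are made up for this example, and the real AssemblyScript object header is not modelled. It shows that one string has three different "lengths" depending on how you count, and that a read should check bounds in bytes, not in characters.

```python
def utf8_len(s):
    """Number of bytes the string occupies when encoded as UTF-8."""
    return len(s.encode("utf-8"))


def utf16_units(s):
    """Number of 16-bit code units the string occupies as UTF-16LE."""
    return len(s.encode("utf-16-le")) // 2


def read_utf16(memory, ptr, units):
    """Read `units` UTF-16 code units at byte offset `ptr`, checking bounds first."""
    end = ptr + units * 2  # two bytes per code unit
    if ptr < 0 or units < 0 or end > len(memory):
        raise IndexError("string lies outside linear memory")
    return bytes(memory[ptr:end]).decode("utf-16-le")


# Hypothetical stand-in for a guest's linear memory.
memory = bytearray(64)
text = "naïve 😀"
encoded = text.encode("utf-16-le")
memory[8:8 + len(encoded)] = encoded

print(len(text), utf8_len(text), utf16_units(text))
print(read_utf16(memory, 8, utf16_units(text)))
```

Allocating by code point count or by UTF-8 byte count here would give the wrong buffer size for a UTF-16 guest, which is the kind of mismatch described above.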

For the specific goal of working with strings in Rust and AssemblyScript, I've created this project: https://github.com/onsails/wasmer-as.

2 comments

I've created an equivalent library for Python and used your project as a reference:

https://github.com/miracle2k/wasmbind

I'm wondering why AssemblyScript uses UCS-2 instead of UTF-8. Do browsers use UCS-2 as well?
AS, like JS, interprets strings as UTF-16LE in the following methods: String.p.codePointAt, String.p.toUpperCase/toLowerCase, String.p.localeCompare, String.p.normalize, String.fromCodePoint, Array.from(str). In all other cases strings are interpreted as UCS-2.
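
A rough way to see the difference between the two views, sketched in Python rather than JS or AS. The function names are made up for this example and only mimic what charCodeAt and codePointAt do; this is not how either engine implements them.

```python
import struct


def code_units(s):
    """Return the string as a list of 16-bit UTF-16LE code units."""
    data = s.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def char_code_at(s, i):
    """UCS-2 view: the raw 16-bit code unit at index i."""
    return code_units(s)[i]


def code_point_at(s, i):
    """UTF-16 view: combine a surrogate pair that starts at index i."""
    units = code_units(s)
    hi = units[i]
    if 0xD800 <= hi <= 0xDBFF and i + 1 < len(units):
        lo = units[i + 1]
        if 0xDC00 <= lo <= 0xDFFF:
            return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)
    return hi  # BMP character or unpaired surrogate


s = "😀"  # U+1F600, outside the Basic Multilingual Plane
print(len(code_units(s)))          # number of code units
print(hex(char_code_at(s, 0)))     # high surrogate only
print(hex(code_point_at(s, 0)))    # full code point
```

For characters inside the BMP both views return the same value, so the distinction only shows up for surrogate pairs.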
They do in all the observable JS APIs, but behind the scenes there are a number of optimizations in each JS engine to deal with the fact that most JS and JSON source comes off the wire in UTF-8 or ASCII.