I'm not going to enter the discussion regarding UTF-8 vs WTF-16 for representing strings, as I lack the context to determine which one is the right approach if everything has to fit the same model. However, I think an approach that allows multiple serialization/deserialization mechanisms depending on the host/guest language seems like a nice way to move it forward.
If you want to chime in and retrieve more context, here are some relevant issues: