by dotancohen
3207 days ago
> So, don't decode to a string, and do all your character manipulation on the bytes.
WHAT?!? I suppose that you've only ever worked with Latin characters. Please show a code example of changing European to African in this sentence in your language of choice, working on the bytes in any multi-byte encoding:
מהי מהירות האווירית של סנונית ארופאית ללא משא?
Yes, that is a Hebrew Monty Python quote. Now try it with a smiley somewhere in the string (HN filtered out my attempt to post the string with a smiley).
Is each application to maintain their own dictionary of code points? If the map is to be in a library, then why not have it in the language itself?
> Please show a code example of changing European to African in this sentence in your language of choice, working on the bytes in any multi-byte encoding: מהי מהירות האווירית של סנונית ארופאית ללא משא?
I don't see the string 'European' in that sentence, it seems to be solely comprised of Hebrew characters.
edit to attempt to answer your question:
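A minimal Python sketch of the byte-level replacement being asked for. This is not the `replacesn`/`splicesn` code referred to in this comment; `replace_utf8` is a hypothetical helper, and the Hebrew word chosen for "African" is an assumption. It relies on the input already being normalised UTF-8, where a whole-character byte pattern can only match on character boundaries.

```python
import unicodedata

def replace_utf8(haystack: bytes, old: str, new: str) -> bytes:
    """Replace `old` with `new` in UTF-8 bytes without decoding the haystack.

    Hypothetical helper. Assumes `haystack` is valid, NFC-normalised UTF-8.
    """
    # Only the short search/replacement strings are normalised and encoded.
    old_b = unicodedata.normalize("NFC", old).encode("utf-8")
    new_b = unicodedata.normalize("NFC", new).encode("utf-8")
    # UTF-8 is self-synchronising, so this byte search cannot match
    # in the middle of a multi-byte character.
    return haystack.replace(old_b, new_b)

OLD = "ארופאית"    # "European", as spelled in the sentence
NEW = "אפריקאית"   # "African" (assumed spelling)

sentence = "מהי מהירות האווירית של סנונית ארופאית ללא משא?"
raw = sentence.encode("utf-8")  # stands in for bytes read from a file or socket
result = replace_utf8(raw, OLD, NEW)

# Same thing with a four-byte smiley (U+1F600) in the string.
with_smiley = ("\U0001F600 " + sentence).encode("utf-8")
result_smiley = replace_utf8(with_smiley, OLD, NEW)
```

The same approach would not be safe in encodings that are not self-synchronising, where a byte pattern can match across character boundaries.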
splicesn should be obvious, and you normalise your strings before calling replacesn. This is just me crappily re-implementing a fraction of the wchar API without checking MSDN.
edit 2:
> Is each application to maintain their own dictionary of code points?
No, you use the system/standard library for composing/decomposing/normalising codepoints.
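As a small illustration of leaning on the standard library for this, here is a sketch using Python's `unicodedata` module (the choice of Python and of "é" as the sample character are assumptions for illustration only):

```python
import unicodedata

composed = "\u00e9"      # "é" as a single code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent

# The two spellings render the same but differ at the byte level.
differs = composed.encode("utf-8") != decomposed.encode("utf-8")

# Normalising to one form (NFC here) makes byte comparison meaningful.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", composed)
```

No application-level table of code points is involved; the composition data ships with the library.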
> If the map is to be in a library, then why not have it in the language itself?
Why not indeed? What a great idea.