by neild
4723 days ago
It is very seldom that you really want to deal with a string as an array of runes. (If you actually do want to, Go makes it fairly easy: just use []rune rather than string.)

Consider a simple string: "école". How many runes does it contain?

Possibly five:
LATIN SMALL LETTER E WITH ACUTE
LATIN SMALL LETTER C
LATIN SMALL LETTER O
LATIN SMALL LETTER L
LATIN SMALL LETTER E

Possibly six:
LATIN SMALL LETTER E
COMBINING ACUTE ACCENT
LATIN SMALL LETTER C
LATIN SMALL LETTER O
LATIN SMALL LETTER L
LATIN SMALL LETTER E
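
To make the two spellings concrete, here is a minimal Go sketch. The constant names and the runeCount helper are mine, purely for illustration; the only library call is utf8.RuneCountInString from the standard library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// Two spellings of "école" that render identically on screen.
const (
	precomposed = "\u00e9cole"  // LATIN SMALL LETTER E WITH ACUTE, then "cole"
	decomposed  = "e\u0301cole" // LATIN SMALL LETTER E, COMBINING ACUTE ACCENT, then "cole"
)

// runeCount is a hypothetical helper: it counts code points, not bytes.
func runeCount(s string) int {
	return utf8.RuneCountInString(s)
}

func main() {
	fmt.Println(runeCount(precomposed), len(precomposed)) // 5 runes, 6 bytes
	fmt.Println(runeCount(decomposed), len(decomposed))   // 6 runes, 7 bytes
	fmt.Println(precomposed == decomposed)                // false: different byte sequences
}
```

So the rune count tells you about the encoding choices that went into the string, not about how many characters a reader sees.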
If you normalize the string you can guarantee you have the first form, but not every glyph can be represented as a single rune.

Fortunately, you generally don't need to deal with any of this. If you're working with filenames, for example, you really only care about the path separator ('/' or '\' or whatever); everything else is just a bunch of opaque data. You can write a perfectly valid function to split a filename into components without understanding anything about combining characters. When you're dealing with data in this fashion, you rarely if ever care about the number of runes in a string; instead you care about the position of specific runes.
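
A rough sketch of that filename case, assuming an ASCII separator: splitPath is a name I made up for illustration (it is not a standard library function), and it scans bytes without decoding a single rune. That is safe because UTF-8 never uses an ASCII byte value inside a multi-byte sequence.

```go
package main

import (
	"fmt"
	"strings"
)

// splitPath is a hypothetical helper that splits path on an ASCII
// separator byte and drops empty components. Everything between
// separators is treated as opaque data and is never decoded.
func splitPath(path string, sep byte) []string {
	var parts []string
	start := 0
	for i := 0; i < len(path); i++ {
		if path[i] == sep {
			if i > start {
				parts = append(parts, path[start:i])
			}
			start = i + 1
		}
	}
	if start < len(path) {
		parts = append(parts, path[start:])
	}
	return parts
}

func main() {
	// The directory name uses the decomposed (six-rune) spelling of "école".
	p := "/home/e\u0301cole/notes.txt"
	fmt.Printf("%q\n", splitPath(p, '/'))

	// Positions are byte offsets, which is all that slicing needs.
	fmt.Println(strings.LastIndexByte(p, '/'))
}
```

The combining accent passes through untouched; the function only ever needs to know where the separators are.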
In Go:
In Python: