by loeg
2043 days ago
> so there's basically zero chance that if you choose something like comma as your delimiter that it's going to tokenize the middle of a multibyte sequence.

Not just "basically;" there is no possible collision between ASCII characters and any valid multibyte encoding. This can be seen somewhat visually in this table[1] and is an intentional aspect of the UTF-8 design.

[1]: https://en.wikipedia.org/wiki/UTF-8#Encoding
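To make the claim concrete, here is a rough Python sketch, not anything from the thread itself. ASCII occupies bytes 0x00 to 0x7F, while every lead and continuation byte of a UTF-8 multibyte sequence has its high bit set, so a byte-level split on an ASCII delimiter can never land inside a character. The function names and the sample string are made up for illustration.

```python
def multibyte_sequences_avoid_ascii() -> bool:
    """Check every non-ASCII Unicode scalar value: no byte of its UTF-8
    encoding may fall in the ASCII range (below 0x80)."""
    for cp in range(0x80, 0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates are not encodable in valid UTF-8
        if any(b < 0x80 for b in chr(cp).encode("utf-8")):
            return False
    return True


def split_fields(raw: bytes, delimiter: bytes = b",") -> list:
    """Split raw UTF-8 bytes on an ASCII delimiter, then decode each field.
    Decoding would raise UnicodeDecodeError if a split cut a character."""
    return [field.decode("utf-8") for field in raw.split(delimiter)]


# Hypothetical sample mixing 2-, 3-, and 4-byte sequences with ASCII.
sample = "naïve,日本語,emoji 🎉,plain"
fields = split_fields(sample.encode("utf-8"))
no_collisions = multibyte_sequences_avoid_ascii()
```

The exhaustive loop walks about 1.1 million code points, so it takes a moment to run. The split works on raw bytes on purpose: if a comma byte (0x2C) could appear inside a multibyte sequence, one of the fields would fail to decode.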