by Grinnz
2306 days ago
Perl 5 strings are, at the logical level, arrays of codepoints. Those codepoints might be stored internally as UTF-8-encoded bytes, or they might not, but this does not affect how you use the string. Many people don't really understand the string model (because many don't really understand character encoding), but it comes down to this: all input and output is bytes, which by default are stored as the codepoints sharing the ordinals of those bytes. There are several mechanisms for decoding those byte ordinals to the characters they represent, and for encoding them back, either manually or automatically. For most text processing, you do this on STDIN and STDOUT.
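To make that concrete, here is a minimal sketch of the manual route using the core Encode module. The sample string and variable names are just for illustration, and the input is assumed to be UTF-8:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Bytes as they would arrive from a file or STDIN.
# "é" encoded as UTF-8 is the two bytes 0xC3 0xA9.
my $bytes = "caf\xC3\xA9";

# Undecoded, every byte is one codepoint with the same ordinal,
# so the string has 5 elements.
my $byte_len = length $bytes;

# Decoding turns those byte ordinals into the characters they
# represent, so the string now has 4 elements.
my $chars    = decode('UTF-8', $bytes);
my $char_len = length $chars;
my $last_ord = ord substr($chars, -1);

# Encode back to bytes before writing anywhere.
my $out = encode('UTF-8', $chars);

printf "%d bytes, %d characters, last is U+%04X\n",
    $byte_len, $char_len, $last_ord;
```

The automatic route is to put an encoding layer on the handles instead, for example `use open qw(:std :encoding(UTF-8));` or `binmode(STDIN, ':encoding(UTF-8)')`, so that decoding and encoding happen as you read and write.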