Hacker News
by duskwuff 4170 days ago
The documentation you're looking at is quite old; I'd recommend looking at the current version for a better view. (The implementation is largely the same; the documentation has just improved quite a bit since then.)

http://perldoc.perl.org/perluniintro.html

Anyway, Perl 5's Unicode support is quite different from Python's.

Specifically, it doesn't have distinct "Unicode string" and "byte string" types; instead, it has a single unified string type. These strings may be stored internally as Latin-1 or UTF-8, depending on how they were created, but they behave identically for almost[1] all purposes, and there are easy ways to force Perl to convert between the two representations (utf8::upgrade() and utf8::downgrade()). It's still possible to create a nonsensical string if you do something silly, like appending Unicode characters to a string containing raw, undecoded UTF-8 data: the raw bytes get treated as individual Latin-1 characters. That's not something the language can entirely protect you from.

[1]: The only exceptions I'm aware of are functions which explicitly operate on the utf8 status, like utf8::is_utf8(), and operations that work on the underlying bits, like the bitwise operators (&, |, ^) and vec().
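
To make the "nonsensical string" case concrete, here's a rough sketch from the Python side of the comparison. It is not Perl code. It assumes Python 3, and it emulates Perl's behaviour by decoding the raw bytes as Latin-1, which is my approximation of what Perl does when an undecoded byte string is joined with wide characters. The variable names are made up for the example.

```python
# Raw UTF-8 bytes for "é" (U+00E9), as if read from a file without decoding.
raw = "é".encode("utf-8")  # two bytes: 0xC3 0xA9

# Python 3 keeps bytes and str as distinct types, so mixing them fails loudly.
try:
    raw + "→"
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Emulating a unified string type (assumption: each undecoded byte is treated
# as one Latin-1 character when combined with text). No error is raised; the
# result is just wrong.
nonsense = raw.decode("latin-1") + "→"

# Decoding the bytes as UTF-8 first gives the intended text.
correct = raw.decode("utf-8") + "→"

print(mixed_ok)
print(repr(nonsense))
print(repr(correct))
```

The tradeoff is that Python refuses the mix outright, while a unified string type accepts it and leaves you to decode input at the boundary yourself.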