by jheriko
4040 days ago
i think linux/mac systems default to UCS-4, certainly the libc implementations of wcs* do. i agree it's a flawed idea though. 4 billion characters seems like enough for now, but i'd guess UTF-32 will need extending to 64 bits too... and actually how about decoupling the size from the data entirely? that works well enough in the general case of /every type of data we know about/, so i'm pretty sure this specialised use case is not very special.
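
For anyone who wants to check the wchar_t width claim on their own machine, here is a minimal sketch. It assumes a standard CPython build, where `ctypes.c_wchar` mirrors the platform's C `wchar_t`; the variable name `wchar_bytes` is only illustrative.

```python
import ctypes
import sys

# Size in bytes of the C wchar_t type as seen by this Python build.
# Typically 4 on Linux and macOS, and 2 on Windows.
wchar_bytes = ctypes.sizeof(ctypes.c_wchar)

print(sys.platform, "wchar_t is", wchar_bytes * 8, "bits")
```

The result depends on the platform ABI, so treat it as a quick probe rather than a guarantee about any particular libc.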
That's not remotely comparable to the situation in Windows, where file names are stored on disk in a 16-bit not-quite-wide-character encoding, etc. And it's leaked into firmware: GPT partition names and UEFI variables are 16-bit despite never once being used to store anything but ASCII. All that software is, broadly, incompatible and buggy (and of questionable security) when faced with new code points.
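
To make the "not-quite-wide-character" point concrete, here is a small sketch of why 16-bit units break down on newer code points. It is only an illustration of UTF-16 itself, not of any Windows or UEFI API; the helper `utf16_units` is a hypothetical name introduced here.

```python
def utf16_units(text):
    """Return the list of 16-bit code units that encode `text` in UTF-16."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]

# An ASCII letter fits in a single 16-bit unit.
ascii_units = utf16_units("A")

# U+1F600 lies outside the Basic Multilingual Plane, so it needs a
# surrogate pair: two 16-bit units for one code point.
emoji_units = utf16_units("\U0001F600")

print([hex(u) for u in ascii_units])
print([hex(u) for u in emoji_units])
```

Code that assumes one 16-bit unit per character will miscount, truncate, or split strings like the second one, which is roughly the class of bug being described.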