by fauigerzigerk
3371 days ago
The point is, if you handle strings the C way, you're not in conformance with UTF-8. If someone passes you a text file that is verified to be valid UTF-8 and contains, say, access permissions, then you'd better not stop parsing it at the first '\0' character. None of this is a huge problem, but it's something to be aware of. C string handling is incompatible with UTF-8.
That's separate from string handling.
UTF-8 was originally designed to be compatible with NUL-terminated strings and to keep NULs out of well-formed text.
In fact, it was the first point in the 'Criteria for the Transformation Format' listed in the initial proposal for UTF-8:
https://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt
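
For anyone who wants to see both sides of this at the byte level, here is a rough sketch in C. The helper names (count_nul_bytes, c_visible_length, self_check) and the two sample buffers are made up for illustration and are not from the proposal. It assumes the inputs are already well-formed UTF-8 and does no validation. Every byte of a multi-byte UTF-8 sequence has its high bit set, so a 0x00 byte can only come from an encoded U+0000. That code point is still legal, though, and anything that scans for the terminator stops there.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sample data, assumed to be well-formed UTF-8. */

/* U+00E9 (C3 A9) followed by U+20AC (E2 82 AC): no 0x00 byte anywhere. */
static const unsigned char MULTIBYTE[] = { 0xC3, 0xA9, 0xE2, 0x82, 0xAC };

/* 'a', U+0000, 'b': U+0000 encodes as the single byte 0x00. */
static const unsigned char WITH_NUL[] = { 0x61, 0x00, 0x62 };

/* Hypothetical helper: count 0x00 bytes in a length-delimited buffer. */
static size_t count_nul_bytes(const unsigned char *buf, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0x00) {
            n++;
        }
    }
    return n;
}

/* Hypothetical helper: how many bytes code that stops at the first
 * 0x00 (strlen-style) would see, bounded by the real buffer length. */
static size_t c_visible_length(const unsigned char *buf, size_t len)
{
    const unsigned char *p = memchr(buf, 0x00, len);
    return p ? (size_t)(p - buf) : len;
}

/* Illustrative checks only; not a UTF-8 validator. */
static void self_check(void)
{
    /* Multi-byte sequences never contain 0x00, so nothing is lost. */
    assert(count_nul_bytes(MULTIBYTE, sizeof MULTIBYTE) == 0);
    assert(c_visible_length(MULTIBYTE, sizeof MULTIBYTE) == sizeof MULTIBYTE);

    /* An embedded U+0000 hides everything after it from strlen-style code. */
    assert(count_nul_bytes(WITH_NUL, sizeof WITH_NUL) == 1);
    assert(c_visible_length(WITH_NUL, sizeof WITH_NUL) == 1);
}
```

Calling self_check() from a main function should pass silently. If you need to carry U+0000 through, keep an explicit length next to the pointer rather than relying on the terminator.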