|
|
|
|
|
by jstimpfle
143 days ago
|
|
First, because of existing constraints like mutability though direct buffer access, a hypothetical codepoints_size() would require recomputation each time which would be prohibitively expensive, in particular because std::string is virtually unbounded. Second, there is also no way to be able to guarantee that a string encodes valid UTF-8, it could just be whatever. You can still just use std::string to store valid encoded UTF-8, you just have to be a little bit careful. And functions like codepoints_size() are pretty fringe -- unless you're not doing specialized Unicode transformations, it's more typical to just treat strings as opaque byte slices in a typical C++ application. |
|