| While researching this comment I read some of the D library documentation and found what I think is probably a doc bug at this URL: https://dlang.org/phobos/std_utf.html#.byUTF which says: "Throws: UTFException if invalid UTF sequence and useReplacementDchar is set to UseReplacementDchar.yes". My guess is that this is a mistake and it should instead say UseReplacementDchar.no, since it makes sense to throw an exception if you can't use U+FFFD here, rather than do both. Anyway, in my view this is bad the same way the Billion Dollar Mistake is bad, and Rust made the right choice here. Arrays of stuff are great, but they aren't strings. Having to sprinkle "or maybe not" cases all over these libraries, because of course these might not really be strings, results in exception fatigue for your developers, which in turn results in lower quality software and more effort for the conscientious developers who stick it out. D's strings are less stupid than C's (and thus some of the C++ strings), but they're still just arrays which may or may not actually be text. |
Having string be a magic builtin type does not eliminate the problem of dealing with invalid UTF sequences.
Invalid UTF sequences are inherent to the Unicode design, and programmers are left on their own to deal with them. The options are:
1. ignore them
2. use the replacement char
3. throw an exception (or other error indication)
D enables the programmer to pick whichever one they need, on a case-by-case basis.
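
To make the three options concrete, here is a rough sketch. It uses Python's built-in bytes.decode error handlers rather than D's std.utf, purely because they map one-to-one onto the list above; the sample bytes and variable names are made up for illustration, and none of this is meant to show how D itself spells these choices.

```python
# Illustration only: Python's bytes.decode, not D's std.utf API.
# The input is hypothetical: 0xFF can never occur in well-formed UTF-8.
data = b"abc\xffdef"

# 1. Ignore: silently drop the invalid byte.
ignored = data.decode("utf-8", errors="ignore")

# 2. Replacement char: substitute U+FFFD for the invalid byte.
replaced = data.decode("utf-8", errors="replace")

# 3. Throw: strict decoding raises an exception on the invalid byte.
try:
    data.decode("utf-8", errors="strict")
    raised = False
except UnicodeDecodeError:
    raised = True

print(repr(ignored), repr(replaced), raised)
```

The caller picks the policy at each call site, which is the same case-by-case choice described above. A dedicated string type does not make the choice go away; it only moves it to the point where the bytes are first converted.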