by teddyh
1325 days ago
An alternate view: “string” is not a granular enough type, just like “bitfield” is not a type. First, a string could be raw unknown bytes, verified UTF-8, or UCS-2 (or even UTF-16 or UCS-4), and you absolutely need to know which it is. But let’s assume that you’ve been a diligent programmer and filtered all that at the edges, and now have a sequence of Unicode code points (or possibly graphemes). You still need to know the escaped-ness of the string! This is also a form of typing. Perl was early with its concept of “tainted” strings, but modern languages can use types to mark this concept in the code. At every point in your code, you should know exactly what type of value you are holding. If you need to use the types in your language to ensure this, then use types. But make sure of it somehow.
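To make the escaped-ness point concrete, here is a minimal sketch in Rust. The names `RawText`, `HtmlEscaped`, `escape_html` and `render_paragraph` are hypothetical, invented for illustration, and the escaping covers only the five usual HTML characters.

```rust
/// Text that has not been escaped for any output context (hypothetical name).
pub struct RawText(pub String);

/// Text that has been escaped for HTML. The field is private, so code outside
/// this module can only obtain one by calling `escape_html`.
pub struct HtmlEscaped(String);

impl HtmlEscaped {
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// The only conversion from raw text to HTML-escaped text.
pub fn escape_html(input: &RawText) -> HtmlEscaped {
    let mut out = String::with_capacity(input.0.len());
    for c in input.0.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#39;"),
            _ => out.push(c),
        }
    }
    HtmlEscaped(out)
}

/// Accepts only escaped text; passing a `RawText` here is a compile error.
pub fn render_paragraph(body: &HtmlEscaped) -> String {
    format!("<p>{}</p>", body.as_str())
}
```

With this shape, forgetting to escape is a type error rather than a runtime bug, which is roughly what Perl's taint flag checks dynamically.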
This is a language defect. If your language was invented in the 1960s it's an understandable defect, but it's still a defect. I do not want to write string-handling software in a language that doesn't even have an actual string type, only "Eh, maybe this is a string or maybe it's just some random bytes, who cares".
Only in very low-level software should it matter whether the string is in fact represented as UTF-8 or UTF-16 or whatever, but Rust shows that you can write software at a low level and still enforce type safety for strings.
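For anyone who hasn't used Rust, a small sketch of what that looks like: bytes and strings are different types, and the standard library's `String::from_utf8` is the checked conversion between them. The function names `decode_at_edge` and `count_code_points` are made up for this example.

```rust
use std::string::FromUtf8Error;

/// Decode bytes received at the program's edge (hypothetical helper).
/// `String::from_utf8` validates the bytes and returns an error instead of
/// silently producing a "maybe a string".
pub fn decode_at_edge(bytes: Vec<u8>) -> Result<String, FromUtf8Error> {
    String::from_utf8(bytes)
}

/// Code past the edge takes `&str`, which is always valid UTF-8,
/// so it never has to re-check the encoding.
pub fn count_code_points(text: &str) -> usize {
    text.chars().count()
}
```

A `Vec<u8>` cannot be passed to `count_code_points` without going through a checked conversion first, so the "random bytes, who cares" case has to be handled explicitly.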
I agree though that here once again the Right Thing™ is a strong type system. If I've got a Microsoft Graph username, a URL, an email address and a UUID, that's four types, not four strings with human names to distinguish them. We don't need to escape any of these types within their own context.
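A rough sketch of the "four types, not four strings" idea, shown for two of them. Everything here is hypothetical: the type names, the `parse` constructors and `describe_invite` are invented, and the validation is deliberately crude (it is not RFC 5322 or a full UUID parser).

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EmailAddress(String);

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Uuid(String);

impl EmailAddress {
    /// Crude check for illustration: exactly one '@' with text on both sides.
    pub fn parse(s: &str) -> Option<Self> {
        let (local, domain) = s.split_once('@')?;
        if local.is_empty() || domain.is_empty() || domain.contains('@') {
            return None;
        }
        Some(EmailAddress(s.to_owned()))
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

impl Uuid {
    /// Accepts only the hyphenated 8-4-4-4-12 hex form.
    pub fn parse(s: &str) -> Option<Self> {
        let bytes = s.as_bytes();
        if bytes.len() != 36 {
            return None;
        }
        for (i, b) in bytes.iter().enumerate() {
            let ok = match i {
                8 | 13 | 18 | 23 => *b == b'-',
                _ => b.is_ascii_hexdigit(),
            };
            if !ok {
                return None;
            }
        }
        Some(Uuid(s.to_owned()))
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Swapping the two arguments is a compile error, not a runtime surprise.
pub fn describe_invite(to: &EmailAddress, request_id: &Uuid) -> String {
    format!("invite {} -> {}", request_id.as_str(), to.as_str())
}
```

The underlying storage is still a `String` in both cases, but the only way to get an `EmailAddress` or a `Uuid` from outside the module is through `parse`, so holding one means the check already happened.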