Hacker News
by CyberShadow 2810 days ago
> 1. it has proper, validated unicode strings (though the stdlib is not grapheme-aware so manipulating these strings is not ideal)

Sigh, I hoped newer languages would avoid D's mistake. Auto-decoding is slow, unnecessary in most cases, and still only gives you partial correctness, depending on what you're trying to do. It also means that even the simplest string operations may fail, which has consequences for the complexity of the API.

2 comments

I have no idea what you're talking about. Rust rarely does auto-anything, and certainly does not decode (or transcode) strings without an explicit request by the developer: Rust strings are not inherently iterable. As a developer, you specifically request an iterator over code units or code points (or over grapheme clusters or words through https://unicode-rs.github.io/unicode-segmentation).
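To make the "explicit request" point concrete, here is a minimal sketch using only the standard library. The helper names `count_code_units` and `count_code_points` are made up for illustration; `bytes()` and `chars()` are the real `str` methods. Grapheme clusters are left out because they need the external crate linked above.

```rust
// Hypothetical helper: iterate over UTF-8 code units (bytes), no decoding.
fn count_code_units(s: &str) -> usize {
    s.bytes().count()
}

// Hypothetical helper: iterate over code points; decoding happens only
// because chars() was asked for explicitly.
fn count_code_points(s: &str) -> usize {
    s.chars().count()
}

fn main() {
    // "héllo", with é written as an escape so the encoding is unambiguous.
    let s = "h\u{e9}llo";
    // Note: `for x in s { ... }` does not compile, since &str is not
    // iterable by itself; you have to pick bytes() or chars().
    println!(
        "{} code units, {} code points",
        count_code_units(s),
        count_code_points(s)
    );
}
```

The two counts differ here because é takes two bytes in UTF-8, which is why the language makes you say which unit you mean.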
I see, thanks for the clarification - looks like I mis-extrapolated from your comment.
That sounds more like an implementation issue than a design issue. If you are using UTF-8, actually decoding into Unicode code points is not necessary for most operations, and Rust will not do that.
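As a rough illustration of an operation that needs no decoding, substring search can be done on the raw bytes. `find_bytes` below is a hypothetical, deliberately naive helper, not how the standard library implements `str::find`; the sketch only shows that a byte-level match agrees with the string-level one when both inputs are valid UTF-8.

```rust
// Hypothetical helper: naive substring search over raw bytes.
// No code point is ever decoded here.
fn find_bytes(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    if needle.is_empty() {
        return Some(0); // windows(0) would panic, so handle this case first
    }
    haystack.windows(needle.len()).position(|w| w == needle)
}

fn main() {
    let text = "na\u{ef}ve caf\u{e9}"; // "naïve café"
    let needle = "caf\u{e9}"; // "café"

    let by_bytes = find_bytes(text.as_bytes(), needle.as_bytes());
    let by_str = text.find(needle); // returns a byte offset, not a char index

    assert_eq!(by_bytes, by_str);
    println!("found at byte offset {:?}", by_str);
}
```

This works because UTF-8 is self-synchronizing: a valid needle can only match starting at a code point boundary of a valid haystack.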

It also does not imply that string operations may fail. String construction from raw bytes may fail, but otherwise the use of UTF-8 strings should not introduce additional failure conditions.
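A small sketch of where the failure lives, assuming a hypothetical wrapper `parse_utf8` around the real `String::from_utf8`: validation can fail once, at construction from raw bytes, and the operations shown afterwards return plain values rather than a `Result`.

```rust
use std::string::FromUtf8Error;

// Hypothetical wrapper: the one fallible step is validating raw bytes.
fn parse_utf8(bytes: Vec<u8>) -> Result<String, FromUtf8Error> {
    String::from_utf8(bytes)
}

fn main() {
    // 0xFF can never appear in well-formed UTF-8, so construction fails.
    assert!(parse_utf8(vec![0xff, 0xfe]).is_err());

    // Valid bytes ("hi") succeed.
    let mut s = parse_utf8(vec![0x68, 0x69]).expect("valid UTF-8");

    // From here on, these operations have no error path for encoding:
    s.push_str(" there");
    let upper = s.to_uppercase();
    let has_hi = s.contains("hi");

    println!("{} / {} / {}", s, upper, has_hi);
}
```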