Hacker News
by chucksmash 3214 days ago
> Why optimize your syntax with `len(txt)` and `txt[0]` when, as you point out, you can't do that?

I like Rust's approach. It's a strings-are-UTF-8 language, but strings (both `str` and `String`):

- are not directly indexable

- force you to be explicit when iterating: you iterate over either `s.chars()` or `s.bytes()`

- are called out in the docs as being a vector of unsigned 8-bit integers internally

- support a len() method that is called out as returning the length of that vector

- can be sliced if you reaaaally need to get around the inability to index directly, but attempting to slice in the middle of a character causes a panic
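
To make the points above concrete, here is a minimal sketch using only standard-library methods (`len`, `chars`, `bytes`, `get`). The helper names `byte_and_char_len`, `utf8_bytes`, and `try_slice` are made up for illustration and are not part of Rust's API.

```rust
/// Returns (length in bytes, number of chars) for a string.
/// `len()` counts UTF-8 bytes; `chars().count()` counts Unicode scalar values.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

/// Collects the raw UTF-8 bytes, i.e. what iterating over `s.bytes()` yields.
fn utf8_bytes(s: &str) -> Vec<u8> {
    s.bytes().collect()
}

/// Non-panicking slice by byte range (hypothetical helper).
/// `str::get` returns `None` when either end is out of bounds or falls in
/// the middle of a character, where `&s[start..end]` would panic instead.
/// Note that `s[0]` does not compile at all: `str` has no integer indexing.
fn try_slice(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}
```

With `"héllo"`, `byte_and_char_len` should give `(6, 5)`, since `é` takes two bytes in UTF-8, and `try_slice("héllo", 0, 2)` should give `None` because byte 2 lands inside `é`.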

2 comments

> - support a len() method that is called out as returning the length of that vector

They should have called that one bytelen() then.

And how do you get a proper offset for slicing? Do you then have to interpret the UTF-8 bytes yourself, or can you somehow get it via the chars() iterator or something similar?
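
For what it's worth, one way to get such an offset without decoding UTF-8 by hand is the standard library's `char_indices()`, which yields each char together with the byte offset where it starts. A rough sketch follows; `byte_offset_of_char` and `slice_chars` are hypothetical helpers named here for illustration, and "char" means a Unicode scalar value, not a user-perceived grapheme.

```rust
/// Byte offset at which the `n`-th char (0-based) starts.
/// `n == number of chars` maps to `s.len()` so it can serve as an end bound;
/// anything past that returns `None`.
fn byte_offset_of_char(s: &str, n: usize) -> Option<usize> {
    s.char_indices()
        .map(|(byte_idx, _ch)| byte_idx)
        .chain(std::iter::once(s.len()))
        .nth(n)
}

/// Slices by char positions `[start, end)` instead of byte positions.
/// Returns `None` if either position is out of range or `start > end`.
fn slice_chars(s: &str, start: usize, end: usize) -> Option<&str> {
    let from = byte_offset_of_char(s, start)?;
    let to = byte_offset_of_char(s, end)?;
    s.get(from..to)
}
```

Because the offsets come from `char_indices()`, they always fall on character boundaries, so the slice cannot hit the mid-character panic. The cost is a linear scan to find each offset.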

Yeah, this is the way to go for sure.