
Hacker News

by estebank 2716 days ago

One nitpick: `.chars`[1] gives you an iterator[2] over `char`s[3], each of which is always a 4-byte representation of a valid Unicode scalar value. This means that `"asdf".chars().collect::<Vec<char>>()` will have a different size from `"asdf"` and `"asdf".chars().as_str()`. `.chars()` will never give you an incomplete code point, but it will give you incomplete characters, since many c̶̼̟̏ó̷̘̉n̴̖̞̏̇t̸̡̃ĭ̸̻̬n̴̯͉̂͑ṵ̴̑a̷̛̫̳ẗ̸͕́i̷̱̫̓̋ǫ̸̑ǹ̶̼̅s̸̩̾̌ (combining marks) can follow a base code point to represent what is visually a single character.

[1]: https://doc.rust-lang.org/std/string/struct.String.html#meth...

[2]: https://doc.rust-lang.org/std/str/struct.Chars.html

[3]: https://doc.rust-lang.org/std/primitive.char.html
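To make the size difference concrete, here is a minimal sketch using only the standard library. The helper `sizes` is a hypothetical name invented for illustration, and collecting into a `Vec<char>` is an assumption about the target collection.

```rust
use std::mem::size_of;

/// Hypothetical helper: returns (UTF-8 byte length, number of `char`s,
/// bytes occupied by those `char`s once collected into a `Vec<char>`).
fn sizes(s: &str) -> (usize, usize, usize) {
    let chars: Vec<char> = s.chars().collect();
    (s.len(), chars.len(), chars.len() * size_of::<char>())
}

fn main() {
    // ASCII only: 4 bytes of UTF-8, 4 chars, 16 bytes as Vec<char> contents.
    println!("{:?}", sizes("asdf")); // (4, 4, 16)

    // "e" followed by U+0301 COMBINING ACUTE ACCENT: usually rendered as a
    // single accented letter, but `.chars()` yields two separate `char`s.
    println!("{:?}", sizes("e\u{301}")); // (3, 2, 8)

    // `as_str()` on the iterator borrows the original UTF-8 bytes,
    // so its length matches the source string.
    println!("{}", "asdf".chars().as_str().len()); // 4
}
```

The second case is the "incomplete character" situation: every `char` is a complete code point, yet neither one alone is what a reader would call the character.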

1 comment

moosingin3space 2716 days ago

> visually are a single char

IIRC that's what grapheme clusters are for.
