Hacker News
by nabla9 3137 days ago
I'm not familiar with Rust libraries, but I would guess it's just counting code points and not characters, so strictly speaking both are cheating.

I would love to see people show how to do simple string processing, like counting characters at the proper grapheme cluster level, in their favorite programming language.

1 comment

   for (c1, c2) in val.chars().zip(val.chars().skip(1))
chars() iterates by Unicode scalar values. It'd be bytes() for bytes.
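
To make the chars() vs. bytes() difference concrete, here is a minimal std-only sketch. The helper names adjacent_equal_chars and adjacent_equal_bytes are made up for illustration, and the loop body (counting equal neighbours) is an assumption, since the original loop body isn't shown here.

```rust
// Hypothetical helper: counts adjacent pairs of equal Unicode scalar values.
fn adjacent_equal_chars(val: &str) -> usize {
    val.chars()
        .zip(val.chars().skip(1))
        .filter(|(c1, c2)| c1 == c2)
        .count()
}

// Hypothetical helper: the same pairing, but over raw UTF-8 bytes.
fn adjacent_equal_bytes(val: &str) -> usize {
    val.bytes()
        .zip(val.bytes().skip(1))
        .filter(|(b1, b2)| b1 == b2)
        .count()
}

fn main() {
    // Two precomposed U+00E9 characters; each one is two bytes in UTF-8.
    let doubled = "\u{e9}\u{e9}";
    println!("equal char pairs: {}", adjacent_equal_chars(doubled));
    println!("equal byte pairs: {}", adjacent_equal_bytes(doubled));

    // "e" followed by U+0301 COMBINING ACUTE ACCENT: it renders as one
    // character, but chars() sees two scalar values.
    let combining = "e\u{301}";
    println!("scalar values: {}", combining.chars().count());
    println!("bytes: {}", combining.len());
}
```

The last case is the one where neither chars() nor bytes() matches what a reader would call a character, which is where grapheme clusters come in.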

If you wanted to do it by grapheme clusters, you'd add https://crates.io/crates/unicode-segmentation to your Cargo.toml, add the relevant imports you see on that page to your code, and change the above line to

   for (c1, c2) in UnicodeSegmentation::graphemes(val, true).zip(UnicodeSegmentation::graphemes(val, true).skip(1))
... possibly splitting that up into variables, because dang, that's a long line.

Then you're getting &strs instead of chars in the iteration, but I think the body still stays the same, as == compares by value.