by SideburnsOfDoom 3297 days ago
> break in subtle ways when they handle characters that span multiple codepoints

Or equivalently: there is more than one way to turn a string into a list. It can be, for example, a sequence of bytes, Unicode code points, or grapheme clusters. Being explicit about the conversion is therefore a good idea.
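
To make that concrete, here is a rough Python sketch of the three conversions. The function names are made up for illustration, and the grapheme version is only an approximation: it attaches combining marks to the preceding character and does not implement the full Unicode segmentation rules (emoji sequences, Hangul and so on would need a proper library).

```python
import unicodedata


def as_bytes(s, encoding="utf-8"):
    # One list item per encoded byte; the result depends on the encoding.
    return list(s.encode(encoding))


def as_code_points(s):
    # Iterating a Python str yields one item per Unicode code point.
    return list(s)


def as_rough_graphemes(s):
    # Approximation only: glue each combining mark onto the previous item.
    clusters = []
    for ch in s:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


text = "cafe\u0301"  # "café" written as 'e' followed by a combining acute accent
print(len(as_bytes(text)), len(as_code_points(text)), len(as_rough_graphemes(text)))
```

The same string gives three different lengths depending on which "list" you asked for, which is the reason to make the caller pick one by name.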

1 comment

Don't forget splitting on word boundaries and/or whitespace - going from a string of text to an iterable collection of words (strings).
Or, in the case of (e.g.) domain names, splitting on dots. More generally: given a set of separator characters, breaking the string into a collection of substrings.
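
A minimal Python sketch of those two splits, with hypothetical helper names. It assumes "words" just means runs of non-whitespace (real word-boundary detection is more involved) and that empty pieces between adjacent separators should be dropped, which is a choice rather than a given.

```python
import re


def split_words(text):
    # str.split() with no argument splits on runs of whitespace
    # and discards leading/trailing empties.
    return text.split()


def split_on_any(text, split_chars):
    # Split on any single character from split_chars, e.g. "." or "-_.".
    if not split_chars:
        return [text]
    pattern = "[" + re.escape(split_chars) + "]"
    # Drop empty pieces produced by adjacent or leading/trailing separators.
    return [part for part in re.split(pattern, text) if part]


print(split_words("  going from text   to words "))
print(split_on_any("news.ycombinator.com", "."))
```

If the empty pieces matter (say, to detect a malformed name like "a..b"), remove the filter and keep what re.split returns.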