by skizm
4723 days ago
I think the majority of the time you want the number of utf8 characters and not the byte count. In fact I have never wanted the byte count. If I did I would expect something like byteLen and Len. Not the other way around. You should be optimizing the common case, not the exception. Obviously I'm not a language designer so perhaps I'm talking out of my ass but I've heard this complaint A LOT.
But that doesn't matter. Let's say you are correct: when working with strings, you more often want the rune length. It still wouldn't be the right decision, given the other design decisions of Go, because it would have needlessly complicated things with only arguable benefits. Let me show you what I mean.
The len() function works with a whole lot of things: strings, arrays, slices, maps and channels. For the first three, len() returns the number of elements in the underlying array, and for a string those elements are bytes. This is because all three are backed by an array, and so sensibly have similar semantics. It would have violated the principle of least surprise for anyone who knew the language to have an array-backed type whose len() is not a count of the elements of that array. Both the language developers and its users would have to special-case strings, in code and in their brains.
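To make that concrete, here is a small sketch of len() across those types. The helper name lengths and the sample string are just mine for illustration; utf8.RuneCountInString is the standard library call you reach for when you do want the rune count.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// lengths (a hypothetical helper) reports len() of a string, len() of the
// same data held as a byte slice, and the rune count from unicode/utf8.
func lengths(s string) (strLen, sliceLen, runes int) {
	b := []byte(s)
	return len(s), len(b), utf8.RuneCountInString(s)
}

func main() {
	s := "héllo" // 'é' takes two bytes in UTF-8
	strLen, sliceLen, runes := lengths(s)
	fmt.Println(strLen, sliceLen, runes) // 6 6 5

	arr := [3]int{1, 2, 3}
	m := map[string]int{"a": 1}
	ch := make(chan int, 4)
	ch <- 1 // one element queued in the buffer
	fmt.Println(len(arr), len(m), len(ch)) // 3 1 1
}
```

The string and the byte slice agree, which is the consistency being argued for; the rune count is one explicit call away.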
Now, they could have decided to do it anyway, but then another surprise awaits. What happens when you take a slice of a string? Oh no, more special casing and more complication for everyone.
The Go developers do special-case where doing so would clearly be a win for their users. Consider range, which iterates by runes over a string, potentially moving the index on the underlying array forward by more than 1 on each pass. That is clearly what the user will want in the most common case, and so was worth doing. It also eliminates many of the use cases where getting the length of a string in runes would matter to you. Not all, but a lot.