by lagom
4723 days ago
Aside from the API, there's a convincing performance argument for having len() return the byte count rather than the number of UTF-8 characters (runes). A string in Go is implemented as a 2-word struct containing a pointer to the start of the string and its length in bytes. Under this implementation, len(s) is O(1) and RuneCountInString(s) is O(n). It makes sense for the default case to also be the fast one, particularly since people appreciate Go for its performance. Alternatively, you could store the rune count in the 2-word struct instead, which would reverse those runtimes. However, that hurts the common operations of converting between []byte and string and of writing a string to a buffer. With the current Go implementation both are a simple memcpy, because the byte length is already known; with the alternative they would first need an O(n) decoding pass over the data. Perhaps you could make it a 3-word struct that holds both the byte length and the rune length, but then every string takes up more memory and costs more to pass as a function argument.
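To make the difference concrete, here is a minimal sketch comparing the two lengths and checking the size of the string header. The helper names lengths and headerWords are made up for this example, and the 2-word result assumes the standard gc toolchain; the language spec does not promise that layout.

```go
package main

import (
	"fmt"
	"unicode/utf8"
	"unsafe"
)

// lengths returns the byte length (O(1), read from the string header)
// and the rune count (O(n), requires decoding the UTF-8 data) of s.
func lengths(s string) (byteLen int, runeLen int) {
	return len(s), utf8.RuneCountInString(s)
}

// headerWords reports how many machine words a string value occupies.
// Assumption: with the standard gc toolchain this is 2 (pointer + length).
func headerWords() int {
	var s string
	return int(unsafe.Sizeof(s) / unsafe.Sizeof(uintptr(0)))
}

func main() {
	// "hello" is pure ASCII: 5 bytes, 5 runes.
	// "h\u00e9llo" has one 2-byte rune: 6 bytes, 5 runes.
	// "\u65e5\u672c\u8a9e" has three 3-byte runes: 9 bytes, 3 runes.
	for _, s := range []string{"hello", "h\u00e9llo", "\u65e5\u672c\u8a9e"} {
		b, r := lengths(s)
		fmt.Printf("%q: %d bytes, %d runes\n", s, b, r)
	}
	fmt.Println("string header words:", headerWords())
}
```

The two numbers only agree for ASCII-only strings, which is why the cheap one is the one you get from len().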