|
I don't think copy elision[1,2] means what you think it means, it simply allows the compiler to avoid e.g. allocating a new string when returning a string, or avoid allocating a new string to store the result of a temporary. That is, copy elision allows std::string str = data.substr(2, 3);
return str;
to only allocate one new string (for the return value of substr), instead of two. There's no way the compiler can get out of constructing at least one std::string for the return value, especially if there's any form of dynamic substr'ing (e.g. parsing a CSV file with columns that aren't all the same width).Sharing is only multithreading unfriendly if there's modification happening, and modification of textual (i.e. Unicode) data is bad practice and hard to get right, since all Unicode encodings are variable width (yes, even UTF-32, it is a variable width encoding of visible characters). Furthermore, a string_view is strictly better than a string for many applications, since a string_view can always be copied into a string by the caller if necessary (i.e. each function can choose to return the most sensible/most performant thing, which is a string_view if it's just a substring of one of the arguments). The only sensible argument against string_view in C++ I know is: it's easy to get dangling references. Which is correct, but that's a general problem with C++ itself, not with the idea of string views (Rust has a perfectly safe version in the form of &str, which cannot become dangling like in C++). > Keep in mind that on a 64-bit architecture a view is at least 16 bytes large and that small strings can be copied to the stack resulting in better locality and reduced memory usage. No, a string_view points into memory that already exists, there's no increased memory usage; a small string copied on to the stack will be part of the string struct, which is at least 3 * 8 = 24 bytes: a pointer, the length and the capacity. Also, a memcpy out of the original string is always going to be more expensive than just getting the pointer/length (or pair of pointers) for a string_view, since the memcpy has to do this anyway. [1]: http://en.wikipedia.org/wiki/Copy_elision [2]: http://definedbehavior.blogspot.com/2011/08/value-semantics-... |
Sharing is only multithreading unfriendly if there's modification happening, modification of textual (i.e. Unicode) data is bad practice and hard to get right
Read-only access to data indeed scales "infinitely" on modern architectures.
No, a string_view points into memory that already exists,
Yes. Right. How do you store that? You need at least one pointer and and an int or two pointers. That 16 bytes. Memcpy for a couple of bytes is very quick when it's stack to stack thanks to page locality.
Also, if you are using pointers you will have aliasing issues which will have an impact on performance. If you work by values you allow the compiler to optimize things better.
For small strings string view are just dumb and "most of the time" strings are very small.
To give a better example of why working a string view is both a bad idea and dangerous, it's as if you said "I don't want to copy this vector, therefore I will work on iterators". That's obviously a bad idea.