by Calavar 1206 days ago
I would expect that std::string_view would still be significantly faster. Copying or moving an std::string with small string optimization is likely going to boil down to a branch (to check if the instance is using the small string optimization) and a memcpy. As opposed to copying or moving an std::string_view, which should be two MOV instructions.
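A quick way to check the "two MOVs" intuition on a given toolchain is to look at the type traits. This is a minimal sketch: the helper names `view_is_two_words` and `string_is_trivial` are made up for illustration, and the two-word size is what the major standard libraries use in practice, not something the standard promises.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical helper: true when std::string_view is a trivially copyable
// pointer-plus-length pair on this implementation (an assumption that holds
// on the common standard libraries, not a standard guarantee).
constexpr bool view_is_two_words() {
    return std::is_trivially_copyable_v<std::string_view> &&
           sizeof(std::string_view) == 2 * sizeof(void*);
}

// Hypothetical helper: std::string owns memory and has a non-trivial
// destructor, so it cannot be copied with a plain bitwise copy.
constexpr bool string_is_trivial() {
    return std::is_trivially_copyable_v<std::string>;
}
```

If the first helper returns true, a copy of the view can be done as two register-sized loads and stores, whereas a `std::string` copy has to run real constructor logic.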

string_view is theoretically going to be faster, but it's the type of thing you'd really need to profile in actual context to see to what extent that is true. I was mainly just pointing out that (small) strings are actually way faster than people would think, since they don't need to allocate memory.

If I were actually tasked with hyper-optimizing a tokenizer, I would probably skip past string_view and use a pair of U16 indexes instead, assuming the input file is less than 65k characters [with a "slow path" that uses U32 instead]. I just think that it's probably not going to be a whole order of magnitude faster than just using string (unless there are long tokens).
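The pair-of-U16-indexes idea could look roughly like the following. This is only a sketch under stated assumptions: `Token16`, `tokenize16`, and `text_of` are hypothetical names, tokens are split on whitespace purely for illustration, and the input is assumed to fit in 16-bit offsets (the U32 slow path is not shown).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical token type: two 16-bit offsets into the source buffer,
// 4 bytes in total instead of a pointer plus a length.
struct Token16 {
    std::uint16_t begin;  // offset of the first character
    std::uint16_t end;    // one past the last character
};

// Splits on whitespace. Assumes input.size() <= 65535 so every offset
// fits in 16 bits; larger inputs would need the U32 "slow path".
std::vector<Token16> tokenize16(std::string_view input) {
    assert(input.size() <= UINT16_MAX);
    std::vector<Token16> tokens;
    std::size_t i = 0;
    while (i < input.size()) {
        // Skip leading whitespace.
        while (i < input.size() &&
               std::isspace(static_cast<unsigned char>(input[i]))) ++i;
        const std::size_t start = i;
        // Consume the token body.
        while (i < input.size() &&
               !std::isspace(static_cast<unsigned char>(input[i]))) ++i;
        if (i > start) {
            tokens.push_back({static_cast<std::uint16_t>(start),
                              static_cast<std::uint16_t>(i)});
        }
    }
    return tokens;
}

// Recovers the token text on demand; no characters are copied.
std::string_view text_of(std::string_view input, Token16 t) {
    return input.substr(t.begin, t.end - t.begin);
}
```

The trade-off is that a token is meaningless without the buffer it indexes into, so every consumer has to carry the source text around as well.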

I don’t see why moving std::string needs to branch. You just copy the source and then zero it out, unconditionally.
It depends on your STL implementation's representation of string: https://godbolt.org/z/nMYGYoWbq

* libstdc++ has an internal reference to its own address for the SSO. If the moved-from string was referencing its SSO buffer, the moved-to string needs to use its own address. The branch is differentiating the SSO state from a heap-allocated state.

* libc++ string move can be implemented this way, but the branch ends up happening on access to the string instead. It also still needs to discard the old heap-allocated buffer, if need be.
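To make the libstdc++ point concrete, here is a toy model of a string whose data pointer refers to its own inline buffer in the SSO case. It is not the real libstdc++ code: `SelfPtrString`, the 16-byte buffer, and the missing capacity field are all simplifications for illustration. It only shows why a move constructor for this kind of layout has to branch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Toy model (hypothetical, simplified): data_ always points at the
// characters, either into local_ (SSO) or into a heap block.
class SelfPtrString {
public:
    explicit SelfPtrString(const char* s) : size_(std::strlen(s)) {
        data_ = (size_ < kLocal) ? local_ : new char[size_ + 1];
        std::memcpy(data_, s, size_ + 1);  // includes the terminator
    }

    SelfPtrString(SelfPtrString&& other) noexcept : size_(other.size_) {
        if (other.is_local()) {
            // SSO case: other.data_ points inside `other` itself, so
            // copying the pointer would alias the source object. Copy the
            // characters and point at our own buffer instead.
            std::memcpy(local_, other.local_, size_ + 1);
            data_ = local_;
        } else {
            // Heap case: steal the allocation.
            data_ = other.data_;
        }
        // Leave the source as a valid empty SSO string.
        other.data_ = other.local_;
        other.size_ = 0;
        other.local_[0] = '\0';
    }

    SelfPtrString(const SelfPtrString&) = delete;
    SelfPtrString& operator=(const SelfPtrString&) = delete;

    ~SelfPtrString() {
        if (!is_local()) delete[] data_;
    }

    const char* c_str() const { return data_; }
    std::size_t size() const { return size_; }
    bool is_local() const { return data_ == local_; }

private:
    static constexpr std::size_t kLocal = 16;
    char* data_;
    std::size_t size_;
    char local_[kLocal];
};
```

A layout that stores a flag instead of a self-referencing pointer can move with an unconditional copy and zero, but then every access has to test the flag to find the characters, which is where the branch moves to.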