This appears to be incorrect. I have seen a CppCon talk about that; I think it may have been this one: https://www.youtube.com/watch?v=rHIkrotSwcc . On the other hand, only a very, very tiny minority of programmers have to care about an overhead as low as this. Most of us should not waste one microsecond of our thinking time on it.
Not in my experience when stepping through debug builds in gdb/lldb: the debugger very often steps into the operator-> implementation first, which then requires me to step in again to get to the actual function call I wanted.
I'm not talking about code performance, I'm talking about the human time spent stepping through extra things in a debugger while working out what's going on.
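To make the stepping complaint concrete, here is a minimal sketch of the two call shapes. `Widget`, `call_via_unique` and `call_via_raw` are hypothetical names made up for illustration, and what a debugger actually does depends on the toolchain and its step-filter settings.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type, only here to give us a member function to call.
struct Widget {
    int run() { return 42; }
};

int call_via_unique(const std::unique_ptr<Widget>& p) {
    // In an unoptimized build std::unique_ptr::operator-> is an ordinary
    // out-of-line call, so "step" can land inside it before reaching run().
    return p->run();
}

int call_via_raw(Widget* w) {
    assert(w != nullptr);
    // The built-in -> is not a function, so there is nothing to step
    // through before run().
    return w->run();
}
```

Both functions return the same value; the only difference is how many frames sit between the call site and `run()` when nothing has been inlined.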
Sadly, if you don't ignore the platform ABI, there is usually overhead (it might be free on some platforms) relative to a void* when passing a unique_ptr between functions.
Also, there are the usual code-size and compile-time hazards of templates.
unique_ptr has zero overhead except when passing or returning by move. But I don't find myself doing that anywhere performance matters.
The overhead is not inherent, but is rather a consequence of C ABI choices inherited into C++ calling conventions. It is, though, real, and not easily fixed. E.g., RISC-V's ABI binding probably suffers from the same overhead, not for any good reason, but just because the RISC-V crowd couldn't spare the attention to fix it.
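A rough sketch of where the calling-convention cost comes from, assuming a mainstream standard library and the default deleter. `consume_raw` and `consume_owned` are hypothetical names. The size check is not guaranteed by the standard, but it holds on the common implementations.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Same storage as a raw pointer (assumption: default deleter, mainstream stdlib).
static_assert(sizeof(std::unique_ptr<int>) == sizeof(int*),
              "unique_ptr<int> is pointer-sized here");

// But it has a non-trivial destructor and move constructor. Under the Itanium
// C++ ABI, such a type is passed by invisible reference: the caller makes a
// temporary in memory and passes its address, rather than passing the pointer
// value in a register.
static_assert(!std::is_trivially_copyable<std::unique_ptr<int>>::value,
              "unique_ptr is not trivially copyable");
static_assert(!std::is_trivially_destructible<std::unique_ptr<int>>::value,
              "unique_ptr has a non-trivial destructor");
static_assert(std::is_trivially_copyable<int*>::value,
              "a raw pointer is trivially copyable");

// Hypothetical sink taking ownership through a raw pointer.
int consume_raw(int* p) {
    assert(p != nullptr);
    int v = *p;
    delete p;
    return v;
}

// Hypothetical sink taking ownership by value (the "pass by move" case).
int consume_owned(std::unique_ptr<int> p) {
    return *p;  // p's destructor frees the int when the parameter dies
}
```

If the call is inlined, the calling convention no longer applies and the two sinks can compile to much the same code; the difference shows up only across a real call boundary.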