|
|
|
|
|
by mort96
282 days ago
|
|
Typically, making it possible for the compiler to decide whether or not to inline a function is going to make code faster compared to disallowing inlining. Especially for functions like strcpy which have a fairly small function body and therefore may be good inlining targets. You're right that there could be cases where the inliner gets it wrong. Or even cases where the inliner got it right but inlining ended up shifting around some other parts of the executable which happened to cause a slow-down. But inliners are good enough that, in aggregate, they will increase performance rather than hurt it. > Even with LTO today you're talking 2-3% overall improvement in execution time Is this comparing inlining vs no inlining or LTO vs no LTO? In any case, I didn't mean to imply that the difference is large. We're literally talking about a couple clock cycles at most per call to strcpy. |
|
And at runtime, there is no meaningful difference between strcpy being linked at runtime or ahead of time. libc symbols get loaded first by the loader and after relocation the instruction sequence is identical to the statically linked binary. There is a tiny difference in startup time but it's negligible.
Essentially the C compilation and linkage model makes it impossible for functions like strcpy to be optimized beyond the point of a function call. The compiler often has exceptions for hot stdlib functions (like memcpy, strcpy, and friends) where it will emit an optimized sequence for the target but this is the exception that proves the rule. In practice, the benefits of statically linking in dependencies (like you're talking about) does not have a meaningful performance benefit in my experience.
(*) strcpy is weird, like many libc functions its accessible via __builtin_strcpy in gcc which may (but probably won't) emit a different sequence of instructions than the call to libc. I say "probably" because there are semantics undefined by the C standard that the compiler cannot reason about but the linker must support, like preloads and injection. In these cases symbols cannot be inlined, because it would break the ability of someone to inject a replacement for the symbol at runtime.