|
|
|
|
|
by mort96
282 days ago
|
|
And there's a performance cost to that. If there was only one implementation of strcpy and it was the version that happens to be picked on my particular computer, and that implementation was in a header so that it could be inlined by my compiler, my programs would execute faster. The downside would be that my compiled program would only work on CPUs with the relevant instructions. You could also have only one implementation of strcpy and use no exotic instructions. That would also be faster for small inputs, for the same reasons. Having multiple implementations of strcpy selected at runtime optimizes for a combination of binary portability between different CPUs and for performance on long input, at the cost of performance for short inputs. Maybe this makes sense for strcpy, but it doesn't make sense for all functions. |
|
You can't really state this with any degree of certainty when talking about whole-program optimization and function inlining. Even with LTO today you're talking 2-3% overall improvement in execution time, without getting into the tradeoffs.