by wyldfire
2447 days ago
> But I was able to work around this by using a trick: creating two variants of the function, one marked with #[inline(always)] (for the hot call sites) and one marked with #[inline(never)] (for the cold call sites).

Can't PGO make inlining decisions like this? Otherwise, Propeller/LTO might work well.

> But there’s a trade-off. Sometimes a simpler, smaller function is slower.

Without a doubt! Imagine the naive/simple/portable memcpy versus a target-aware one that capitalizes on wider or aligned loads and stores.
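For anyone who hasn't seen the two-variant trick from the first quote, here is a rough sketch of how it might look in Rust. The names (`checksum_impl`, `checksum_hot`, `checksum_cold`) and the body are made up for illustration and are not from the article. Note also that the inline attributes are hints to the compiler, not hard guarantees.

```rust
// Hypothetical shared body; the name and the computation are illustrative only.
// Marked inline(always) so each wrapper below gets its own copy of the logic.
#[inline(always)]
fn checksum_impl(data: &[u8]) -> u32 {
    data.iter()
        .fold(0u32, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u32))
}

/// Variant intended for hot call sites: the compiler is asked to inline it
/// into the caller.
#[inline(always)]
pub fn checksum_hot(data: &[u8]) -> u32 {
    checksum_impl(data)
}

/// Variant intended for cold call sites: the compiler is asked to keep it as
/// an out-of-line call, so the caller's code size does not grow.
#[inline(never)]
pub fn checksum_cold(data: &[u8]) -> u32 {
    checksum_impl(data)
}
```

Both wrappers compute the same result; the only difference is which one you call at a given site, which is the manual version of the decision I'd hope PGO could make on its own.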