|
|
|
|
|
by eklitzke
1082 days ago
|
|
Rearranging branches (and perhaps blocks too?) will definitely be done if you are building using FDO, because without FDO (or PGO) the compiler has no idea how likely each branch is to be taken. Cmov can also be enabled by FDO in some cases. However, whether or not using cmov is effective compared to a regular test/jump is highly dependent on how predictable the branch is, with cmov typically performing better when the branch is very unpredictable. Since they got a 6x speedup with cmov, I assume that their test input (which isn't described in the post, and is also not in their GitHub repo) consists of random strings consisting almost entirely of s and p characters. There's nothing wrong with this, but it does make the post seem a little misleading to me, as their clever speedup is mostly about exploiting an unmentioned property of the data that is highly specific to their benchmark. |
|
test code is here: https://github.com/414owen/blog-code/blob/master/02-the-same... it randomly selects between 's' or 'p'. The characters can't be anything other than 's', 'p', or the terminating null. Knowing that particular fact about our input gives us this ...clever... optimization:
which compiles to: This is too clever by half, of course, but it perfectly illustrates your point about exploiting properties of the data.