So why does conventional wisdom say that compilers will, the vast majority of the time, outperform programmers writing assembly by hand? It seems contradictory to me.
In that context, it's not very small, it's 20% (all instructions are register-to-register instructions, so they all have the same weight). It's huge.
Yes, it's possible that ecx is used elsewhere, and in that case my second comment is irrelevant, because I was responding to the suggestion that such a big wart is to be expected from compilers because they crop up regularly.
But then again, it's unlikely that it's used elsewhere, because eax has the return value of the C snippet, there's nothing else to do, the function can return. So the original question remains: did this come from a C compiler? If yes, it's crappy code.
> In that context, it's not very small, it's 20% (all instructions are register-to-register instructions, so they all have the same weight). It's huge.
Do they? I put together two quick-and-dirty nonsense test programs. This is option2:
int main (void) {
    for (int i = 0; i < 1000000000; ++i) {
        asm volatile (
            ".intel_syntax\n"
            "mov eax, edi\n"
            "sar eax, 31\n"
            "add edi, eax\n"
            "xor eax, edi\n"
            :::);
    }
    return 0;
}
option1 has the extraneous mov ecx, eax, and then add with ecx.
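For anyone following along: as far as I can tell, the four instructions in option2 are the usual branchless absolute-value idiom (build a sign mask, add it, then xor with it). The C below is only a sketch of that reading, not the original C snippet, which isn't shown in this thread. The name branchless_abs is made up for illustration, and the code assumes a 32-bit int and that >> on a negative value is an arithmetic shift, which GCC provides but the C standard leaves implementation-defined.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the four instructions in option2.
 * Assumes >> on a negative int32_t is an arithmetic shift (true for GCC). */
static int32_t branchless_abs(int32_t x)
{
    /* sar eax, 31: all ones if x is negative, all zeros otherwise */
    uint32_t mask = (uint32_t)(x >> 31);

    /* add edi, eax: done in unsigned arithmetic so wraparound is well defined */
    uint32_t sum = (uint32_t)x + mask;

    /* xor eax, edi: flips the bits back when x was negative */
    return (int32_t)(sum ^ mask);
}
```

With this reading, branchless_abs(-5) gives 5 and branchless_abs(7) gives 7. Only the mask and the sum are ever live, which is why a copy into a third register looks unnecessary.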
I confirmed with objdump -d that the assembly hadn't been touched and that the loops were the same. On my otherwise mostly idle dual L5640 system and pinned to a single cpu (just in case), option1 consistently runs in 3.14 seconds and option2 consistently runs in 3.15 seconds.
Adding an extra zero to the loop count, both option1 and option2 run in 30.94-30.95 user seconds. The extraneous move doesn't seem to cost any actual time.
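If you want to repeat this kind of comparison without relying on an external timer, here is a rough sketch of an in-process harness. It is not what I ran for the numbers above. The names cpu_seconds, body_under_test and time_loop are hypothetical, and it assumes a POSIX system where clock_gettime supports CLOCK_PROCESS_CPUTIME_ID. Note that calling the body through a function pointer adds overhead that would swamp a one-instruction difference, so for a test like this one you would paste the asm directly into the loop instead.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <time.h>

/* Hypothetical helper: CPU seconds consumed by this process so far. */
static double cpu_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical stand-in for the code being measured. The volatile
 * store keeps the compiler from deleting the loop body. */
static volatile long sink;

static void body_under_test(void)
{
    sink = sink + 1;
}

/* Run body `iterations` times and return the CPU seconds it took. */
static double time_loop(void (*body)(void), long iterations)
{
    double start = cpu_seconds();
    for (long i = 0; i < iterations; ++i)
        body();
    return cpu_seconds() - start;
}
```

Timing each variant several times with the same iteration count, on a pinned and otherwise idle core, is what makes a 0.01 second gap recognizable as noise.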
But rules of thumb are like this. If you know enough to question the rule of thumb, go ahead. Hand assembly in hot code can be worth the cost.
It's also possible the value in ecx is used again outside the snippet?