| I'm not so sure that the right take-away is "hand-written assembler is 6x faster than C." It's more like "jumps are a lot slower than conditional arithmetic." And that can [edit: often] be achieved easily in C by simply not using switch statements when an if statement or two will work fine. Rewriting the C function as follows got a 5.5x speedup:

  int run_switches(char *input) {
    int r = 0;
    char c;
    while (1) {
      c = *input++;
      if (c == 's') r++;
      if (c == 'p') r--;
      if (c == '\0') break;
    }
    return r;
  }
Results:

  [16:50:14 user@boxer ~/looptest] $ gcc -O3 bench.c loop1.c -o lone
  [16:50:37 user@boxer ~/looptest] $ gcc -O3 bench.c loop2.c -o ltwo
  [16:50:47 user@boxer ~/looptest] $ time ./lone 1000 1
  449000
  ./lone 1000 1 3.58s user 0.00s system 99% cpu 3.589 total
  [16:50:57 user@boxer ~/looptest] $ time ./ltwo 1000 1
  449000
  ./ltwo 1000 1 0.65s user 0.00s system 99% cpu 0.658 total
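
To make the "conditional arithmetic" point explicit, here is a minimal sketch that folds the two ifs into a single expression. The name run_switches_branchless is hypothetical and not from the benchmark above. It relies only on C comparisons evaluating to 0 or 1. Whether the compiler actually emits jump-free code depends on the compiler and flags, and this variant was not timed here.

```c
/* Sketch only: same result as run_switches, but the counter update is
   written as arithmetic on comparison results (each is 0 or 1 in C). */
int run_switches_branchless(const char *input) {
    int r = 0;
    char c;
    /* The only remaining test is the loop exit on the terminator. */
    while ((c = *input++) != '\0') {
        r += (c == 's') - (c == 'p');
    }
    return r;
}
```

For example, run_switches_branchless("ssp") counts two 's' and one 'p' and returns 1.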
|
https://owen.cafe/posts/the-same-speed-as-c/
And as others have pointed out, you can tweak the input, then vectorize the algo, if you want to go that route.
I considered this a pedagogical exercise and I sincerely hope nobody will start dropping down to assembly without a very good reason to.
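
A rough sketch of the "tweak the input, then vectorize" route, assuming that the tweak means passing the length explicitly instead of scanning for the terminator (that reading is an assumption, as is the name run_switches_len). With a known trip count and no data-dependent exit, a compiler at -O3 may be able to auto-vectorize the loop. No measurements are claimed for this version.

```c
#include <stddef.h>

/* Sketch only: length-delimited variant. The loop bound is known up
   front, so there is no early exit on '\0' inside the loop body. */
int run_switches_len(const char *input, size_t len) {
    int r = 0;
    for (size_t i = 0; i < len; i++) {
        /* Comparisons evaluate to 0 or 1, so this is pure arithmetic. */
        r += (input[i] == 's') - (input[i] == 'p');
    }
    return r;
}
```

Note that this changes the function's contract: a '\0' inside the buffer no longer stops the count, so callers have to supply the correct length.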