by chc4
2743 days ago
Every single C compiler worth mentioning will turn a switch statement into a jump table. The difference between a jump table and computed goto is only that one is `jmp [ecx+eax*8]` and the other is `mov edx, [ecx+eax*8]; jmp edx`. The latter is faster because of weird branch prediction reasons, and because there are no bounds checks as in the switch.
|
Only if the case values are small integers with a compact range. But if your opcodes are large and/or sparse (more common in non-VM scenarios) compilers don't optimize switch statements very well.
There's a ton of research on switch statement optimization, but most of it doesn't translate to real-world scenarios very well. Compilers will give up fairly quickly on trying to generate a jump table because the work necessary to map the case value to an index at runtime can easily cost more than it's worth in many situations.
Here's some real-world code which you can easily benchmark and compare: https://github.com/wahern/hexdump/blob/master/hexdump.c#L752
Just flip the VM_FASTER macro. Using `hexdump -C < 64MB.random.file > /dev/null`, on my circa 2012 Mac Mini the switch statement is 80% slower than the computed goto, and on my circa 2018 MacBook Pro it's about 15% slower. (See my note elsethread about the post-Haswell branch prediction.) And this is despite the fact that the opcode range is 0-32 without holes!
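For anyone who hasn't seen the two dispatch styles side by side, here's a minimal sketch. It is not the hexdump.c code: the opcodes, `run_switch`, and `run_goto` are made up for illustration, and the second function assumes the GCC/Clang labels-as-values extension.

```c
#include <stddef.h>

/* Hypothetical opcodes for illustration; not taken from hexdump.c. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Switch dispatch: every opcode funnels through one shared dispatch
 * point, and the compiler may emit a range check before the table jump. */
long run_switch(const int *code) {
    long stack[16];
    size_t sp = 0;
    for (;;) {
        switch (*code++) {
        case OP_PUSH: stack[sp++] = *code++; break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp]; break;
        case OP_HALT: return stack[sp - 1];
        default:      return -1; /* unknown opcode */
        }
    }
}

/* Computed goto dispatch: each handler ends with its own indirect jump,
 * and nothing checks that the opcode is in range. */
long run_goto(const int *code) {
    static void *const dispatch[] = {
        &&do_push, &&do_add, &&do_mul, &&do_halt
    };
    long stack[16];
    size_t sp = 0;
#define NEXT() goto *dispatch[*code++]
    NEXT();
do_push: stack[sp++] = *code++; NEXT();
do_add:  sp--; stack[sp - 1] += stack[sp]; NEXT();
do_mul:  sp--; stack[sp - 1] *= stack[sp]; NEXT();
do_halt: return stack[sp - 1];
#undef NEXT
}
```

Both functions run the same bytecode, so you can time them against each other on your own hardware. Whether the gap looks anything like the numbers above will depend on the compiler and the CPU's branch predictor.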
Also, with computed gotos you can remove the jump table altogether. You can make your opcodes actual addresses (just like native code) if you can make the label addresses visible to the code generator. (See my post elsethread for how to make them visible.)
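A rough sketch of what "opcodes as addresses" can look like, again assuming the GCC/Clang labels-as-values extension. The `cell` type, `vm`, and `demo` are hypothetical names, and having the interpreter hand back its label table when called with a NULL program is just one common way to expose the addresses, not necessarily the approach described elsethread.

```c
#include <stddef.h>

/* One slot of threaded code: either a handler address or an immediate. */
typedef union { void *op; long imm; } cell;

/* Hypothetical opcode numbers, used only while assembling a program. */
enum { OP_PUSH, OP_ADD, OP_HALT, OP_COUNT };

/* Called with code == NULL, vm() publishes its label addresses through
 * labels_out so a code generator can emit them directly. Otherwise it
 * executes the threaded code and returns the top of the stack. */
long vm(const cell *code, void *const **labels_out) {
    static void *const labels[OP_COUNT] = { &&do_push, &&do_add, &&do_halt };
    long stack[16];
    size_t sp = 0;

    if (code == NULL) {
        *labels_out = labels;
        return 0;
    }
    /* No table lookup at runtime: the "opcode" is the jump target. */
#define NEXT() goto *(code++)->op
    NEXT();
do_push: stack[sp++] = (code++)->imm; NEXT();
do_add:  sp--; stack[sp - 1] += stack[sp]; NEXT();
do_halt: return stack[sp - 1];
#undef NEXT
}

/* Assemble and run "push 2; push 40; add; halt". */
long demo(void) {
    void *const *op;
    cell prog[6];

    vm(NULL, &op); /* fetch the handler addresses */
    prog[0].op = op[OP_PUSH]; prog[1].imm = 2;
    prog[2].op = op[OP_PUSH]; prog[3].imm = 40;
    prog[4].op = op[OP_ADD];
    prog[5].op = op[OP_HALT];
    return vm(prog, NULL);
}
```

The table still exists here, but only the code generator touches it. The interpreter loop itself never indexes anything.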
P.S. While I implemented hexdump.c using a VM mostly for fun, the end result not only outperforms GNU od and BSD hexdump, but IMHO the code is much easier to read, too.