Hacker News
by revelation 3083 days ago
retpoline is just a convoluted way of doing an indirect jump/call designed to make branch prediction entirely useless. It's a novel concept because doing this is completely opposite to making a program run faster.

Here is an example of the most common programming patterns that end up causing indirect jumps/calls:

https://godbolt.org/g/eThmnG
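
In case the link rots, here is a rough sketch of the kinds of patterns I mean. This is not the code behind the godbolt link; the types and function names (`Shape`, `apply`, `dispatch`) are made up for illustration, and whether a given compiler really emits an indirect jump/call for each one depends on flags and optimization level.

```cpp
// Hypothetical types for illustration only.
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;  // resolved through the vtable at run time
};

struct Triangle : Shape { int sides() const override { return 3; } };
struct Square   : Shape { int sides() const override { return 4; } };

// Pattern 1: virtual call. The target is loaded from the object's vtable,
// so this is typically an indirect call unless the compiler can devirtualize it.
int count_sides(const Shape& s) { return s.sides(); }

// Pattern 2: function pointer / callback. The target is only known at run time.
using BinaryOp = int (*)(int, int);
int add(int a, int b) { return a + b; }
int mul(int a, int b) { return a * b; }
int apply(BinaryOp op, int a, int b) { return op(a, b); }

// Pattern 3: dense switch. Compilers may lower this to a jump table,
// which is reached through an indirect jump.
int dispatch(int opcode, int x) {
    switch (opcode) {
        case 0: return x + 1;
        case 1: return x - 1;
        case 2: return x * 2;
        case 3: return x / 2;
        case 4: return -x;
        default: return x;
    }
}
```

Compile it at -O2 and look for `call *...` / `jmp *...` in the assembly to see which of these your compiler actually turns into indirect branches.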

Imagine every virtual function call in a C++ program being mispredicted and taking twice as long.

(Instead of forcing us to recompile the world, maybe Intel should just disable branch prediction in microcode.)


> Imagine every virtual function call in a C++ program being mispredicted and taking twice as long.

> (Instead of forcing us to recompile the world, maybe Intel should just disable branch prediction in microcode.)

Wouldn't the performance impact be dramatic? In this example[1], there's a 6x slowdown between the cases with and without correct branch prediction.

[1]: https://stackoverflow.com/questions/11227809/why-is-it-faste...

I don’t think this is about all branch prediction, just about predicting the target of an indirect jump. Like, “jump to %rax, but don’t try to guess what %rax is before you’re 100% certain.” That is not the same as “jump to a known location if you think this here register is true/false”. As far as I can piece together, the exploit relies on making the branch predictor think the branch target will be somewhere you stored malicious code, which will then be executed speculatively by another process, e.g. a kernel. If it does harm before the branch predictor catches that it was wrong, you’re home free.

I’m not sure, but that’s what it looks like, so far.

This doesn't affect branches, only indirect jumps (and calls). The performance impact will still be considerable. It will make PGO more crucial (or smarter JITting, for VM languages) since the penalty can be avoided by prefixing an indirect call with a direct call to the most likely target - this is a well-known technique, useful on machines with weaker jump predictors than branch predictors.
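
A minimal sketch of that "direct call first" trick, written out by hand. The names (`Handler`, `fast_path`, `slow_path`, `call_promoted`) are hypothetical, and which target counts as "most likely" is simply assumed here; a PGO build or a JIT would pick it from profile data.

```cpp
using Handler = int (*)(int);

int fast_path(int x) { return x + 1; }  // assumed to be the hot target
int slow_path(int x) { return x * 2; }  // any other target

// Call through a pointer, but test for the likely target first.
int call_promoted(Handler h, int x) {
    if (h == fast_path) {
        // Direct call guarded by a conditional branch: the compiler can
        // inline it, and no indirect branch is executed on this path.
        return fast_path(x);
    }
    // Fallback: ordinary indirect call for every other target.
    return h(x);
}
```

The hot path now costs a compare and a conditional branch instead of an indirect call, so only the cold targets pay the penalty.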

Quite possibly, the worst affected code will be OO code that is dynamically (open) polymorphic.

FWIW, all branches will need to be followed by a memory load fence which seems to trip up speculative execution on Intel CPUs.

There is still branch prediction for normal calls or jumps, which are the majority and should be in performance-conscious code.

It's just that some language features such as virtual functions in C++ often require indirect invocation when the compiler can't devirtualize a call, and there is lots of it in the kernel in performance-critical paths (think interrupts, syscalls).
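
To illustrate the devirtualization point, a small sketch with made-up types (`Device`, `NullDevice`, `ConstDevice`). Whether the compiler actually emits a direct call in the second function depends on the compiler and optimization level, so treat the comments as the usual behavior rather than a guarantee.

```cpp
// Hypothetical interface for illustration only.
struct Device {
    virtual ~Device() = default;
    virtual int read() const = 0;
};

// 'final' tells the compiler no further override can exist.
struct NullDevice final : Device {
    int read() const override { return 0; }
};

struct ConstDevice : Device {
    explicit ConstDevice(int value) : value_(value) {}
    int read() const override { return value_; }
    int value_;
};

// Dynamic type unknown: the call typically goes through the vtable,
// i.e. an indirect call.
int read_any(const Device& d) { return d.read(); }

// Static type is a final class: the compiler may call (or inline)
// NullDevice::read directly, with no indirect branch.
int read_null(const NullDevice& d) { return d.read(); }
```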

There are two paragraphs dedicated to performance impact in the linked PR.