by astrange 1477 days ago
The front end (the decoder stage and branch predictors) is what would theoretically be important for compilers, as it's the bottleneck. But Intel's optimization advice doesn't say much about branches anymore; they pretty much want you to rely on them to take care of it.

That's only partly secrecy; it's also partly to give them freedom to change it. It is, of course, somewhat described in their patents.

1 comment

There are sometimes vague hints about things to avoid, e.g. putting too many branches on the same cache line, and they usually publish the size of their tables (typically 4K or 8K entries these days?). But the actual predictors are wicked devils; they are clearly using some kind of tournament predictor, tiny ML modules (perceptrons), and god knows what else. I studied this carefully when trying to make good Spectre gadgets, but it is very, very difficult to 100% trick (or utilize!) a branch predictor these days. They just learn in interesting ways... and entries alias :-)
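
To make the "entries alias" point concrete, here is a toy sketch in Python. It is nothing like a real predictor: it assumes a plain table of 2-bit saturating counters indexed by the low bits of the branch address, with no history, no tags, and no tournament. The class name, table size, and addresses are all made up for illustration.

```python
class TwoBitPredictor:
    """Toy predictor: a table of 2-bit saturating counters indexed by PC.

    Hypothetical model for illustration only; real hardware also uses
    branch history, tags, and several cooperating predictors.
    """

    def __init__(self, entries):
        self.entries = entries
        self.table = [1] * entries  # every counter starts at "weakly not taken"

    def _index(self, pc):
        # Only the low bits pick the entry, so distant branches can collide.
        return pc % self.entries

    def predict(self, pc):
        return self.table[self._index(pc)] >= 2  # 2 or 3 means "taken"

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


def count_mispredicts(predictor, trace):
    """Run (pc, taken) pairs through the predictor and count wrong guesses."""
    misses = 0
    for pc, taken in trace:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses


def make_trace(pc_a, pc_b, rounds=100):
    """Branch A is always taken, branch B is never taken, interleaved."""
    trace = []
    for _ in range(rounds):
        trace.append((pc_a, True))
        trace.append((pc_b, False))
    return trace


ENTRIES = 4096  # assumed table size, in the range mentioned above

# Two branches that land in different table entries.
no_alias = count_mispredicts(TwoBitPredictor(ENTRIES), make_trace(0x1000, 0x1004))

# Two branches exactly ENTRIES apart, so they share one counter.
alias = count_mispredicts(TwoBitPredictor(ENTRIES), make_trace(0x1000, 0x1000 + ENTRIES))

print("mispredicts without aliasing:", no_alias)
print("mispredicts with aliasing:   ", alias)
```

In this toy model the two perfectly predictable branches keep dragging the shared counter in opposite directions, so the aliased case mispredicts far more often than the non-aliased one. Real predictors mix in history and tags precisely to reduce this kind of interference, which is part of why they are so hard to reason about from the outside.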

I honestly don't know if it's worth trying to optimize branch prediction in compilers these days, beyond the obvious step of putting the highest-probability target next (for fallthrough prediction) and generally laying out the hot parts of the code together. TurboFan and most other dynamically optimizing compilers put rare code at the end of functions, and that's a huge boost.
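
For what it's worth, the layout idea can be sketched in a few lines of Python. This is a rough greedy illustration, not how TurboFan or any particular compiler does it: the function name, the 5% cold threshold, and the crude frequency estimate (the product of edge probabilities along the first path that reaches a block) are all assumptions for the example.

```python
def layout_blocks(entry, successors, cold_threshold=0.05):
    """Greedy basic-block layout (illustrative sketch only).

    successors maps a block name to a list of (target, probability) edges.
    Each block is followed by its most likely unplaced successor, so the
    common path becomes straight-line fallthrough. Blocks whose estimated
    reach frequency is below cold_threshold go to the end of the function.
    """
    hot, cold, placed = [], [], set()
    worklist = [(entry, 1.0)]  # (block, estimated frequency of reaching it)

    while worklist:
        block, freq = worklist.pop()
        while block is not None and block not in placed:
            placed.add(block)
            (cold if freq < cold_threshold else hot).append(block)

            # Unplaced successors, most probable first.
            edges = sorted(
                ((t, p) for t, p in successors.get(block, []) if t not in placed),
                key=lambda edge: edge[1],
                reverse=True,
            )
            if edges:
                # Less likely targets are laid out later.
                worklist.extend([(t, freq * p) for t, p in edges[1:]])
                # The most likely target becomes the fallthrough.
                block, prob = edges[0]
                freq *= prob
            else:
                block = None

    return hot + cold


# Hypothetical function: a check that almost always takes the fast path.
cfg = {
    "entry": [("check", 1.0)],
    "check": [("fast", 0.99), ("slow", 0.01)],
    "fast": [("exit", 1.0)],
    "slow": [("exit", 1.0)],
    "exit": [],
}

order = layout_blocks("entry", cfg)
print(order)
```

On this small graph the fast path comes out as one contiguous run and the rare `slow` block lands last, which is the "rare code at the end of functions" effect described above. A production compiler would use real profile data and handle loops and join points much more carefully.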