by CalChris 2333 days ago
"This results in a loss of a single cycle at the time of instruction fetch." Maybe on that paper CPU, but a branch mispredict on Skylake costs about 16.5 cycles if there's a μop cache hit and 19-20 cycles if there isn't.

https://www.7-cpu.com/cpu/Skylake.html

That said, I didn't know about using BPM to access the PMC (performance-monitoring counter) registers.