Hacker News new | ask | show | jobs
by YourDadVPN 43 days ago
Fair point that binary search causes a lot of branch mispredictions. Did you ever measure whether the unrolled linear search with 8 elements outperforms a "rolled" one? Because with instruction level parallelism I wonder how much difference removing 8 mostly easily predicted (in the linear search) branches makes.
1 comments

I didn't try, but loop contains a branch so there could be misprediction, which doesn't happen with unrolled loop. Also, instruction level parallelism applies to unrolled code as well.