Hacker News new | ask | show | jobs
by sischoel 1871 days ago
Although it is possible to create branchless binary search. Here is one article: https://schani.wordpress.com/2010/04/30/linear-vs-binary-sea..., there are probably other ways to do that.

Another interesting question where one could sink a lot of time is how do to binary search on GPU's using multiple threads. There branching in multiple threads at the same time is also bad, but for slightly different reasons.

1 comments

On a GPU you could do a parallel n-ary search. I.e. instead of making 1 test in the center, make n tests. E.g. 1 round of 32-ary search should be equivalent of 5 rounds of binary search. Not only do you avoid the branch predictor, but the memory access is also parallel, which isn't true in branchless binary search. If the width of the search is larger than a warp/wavefront (32/64), then you need to add synchronization and it might not be faster anymore.

GPUs aren't really built for latency though and this will waste a lot of accesses on speculation. You are probably better off with a good btree.