|
|
|
|
|
by ryao
579 days ago
|
|
_Pragma("GCC unroll 9")
while (nelems > 1) {
auto half = nelems / 2;
nelems -= half;
first += (comp(first[half - 1], value)) * half;
}
return first;
As long as we are sharing improvements, here is mine. Microbenchmarking numerous variations found this performed the best. If your arrays are smaller than the unrolling factor, you don’t even loop. You have exactly 1 branch that is executed exactly once in that case. This is the closest to branch free that a binary search can be.I devised this a couple years ago for use in ZFS’ btree code: https://github.com/openzfs/zfs/commit/677c6f8457943fe5b56d7a... Feel to have it under your choice of OSI approved license, including CC-0. |
|