I think the problem is branch misprediction. Binary and linear searches rely on a lot of unpredictable branches, and those mispredictions ruin performance. Branchless versions of the search run faster. I wrote up the details here: https://news.ycombinator.com/item?id=47983102
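
To make the idea concrete, here is a minimal sketch of a branchless binary search in C. It is not the code from the linked write-up. The function name `lower_bound_branchless` and the choice of `int` elements are my own assumptions for illustration. The loop runs a fixed number of iterations for a given `n`, and the comparison result (0 or 1) is multiplied into the pointer step instead of steering an `if`. Compilers often turn this into a conditional move, but that is not guaranteed, so check the generated assembly.

```c
#include <stddef.h>

/* Hypothetical helper for illustration.
 * Returns the index of the first element >= key in the sorted
 * array a[0..n), or n if every element is smaller than key. */
size_t lower_bound_branchless(const int *a, size_t n, int key)
{
    const int *base = a;
    size_t len = n;

    /* The trip count depends only on n, never on the data,
     * so the loop branch itself is easy to predict. */
    while (len > 1) {
        size_t half = len / 2;
        /* The comparison yields 0 or 1; multiplying by half either
         * keeps base in place or advances it past the lower half.
         * No data-dependent branch appears in the source. */
        base += (base[half - 1] < key) * half;
        len -= half;
    }

    /* One candidate is left (or none when n == 0). The len == 1 test
     * short-circuits so an empty array is never read. */
    return (size_t)(base - a) + (len == 1 && base[0] < key);
}
```

For example, with `int a[] = {1, 3, 5, 7, 9}`, calling `lower_bound_branchless(a, 5, 6)` returns 3, the index of 7. Whether this beats a branchy search depends on the array size, the compiler, and the CPU, so it is worth benchmarking on your own data.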