by cornholio 1205 days ago
> ((table[0] >> nlz) << 2) + ((table[1] >> nlz) << 1) + (table[1] >> nlz)

> Which is five shifts and two additions, so more work plus lookup.

Something like that, perhaps with more bitmasking and less addition. There is no lookup, just three constants loaded into different registers and handled by different out-of-order execution units, then combined in a final addition or OR. So what you will "see" on the critical path is two bitshifts and three bitwise ORs/ANDs.
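To make the "three constants, no lookup" idea concrete, here is a rough Python sketch. It assumes the table holds 3-bit results indexed by nlz and that each constant stores one bit plane of those results. The names `pack_planes`, `lookup` and the `VALUES` data are made up for illustration and are not the table from the thread. It also assumes three distinct constants, so the last term reads the third plane.

```python
def pack_planes(values):
    """Split a list of 3-bit values into three bit-plane constants.

    Bit i of planes[k] holds bit (2 - k) of values[i], so planes[0]
    carries the most significant bit of every entry.
    """
    planes = [0, 0, 0]
    for i, v in enumerate(values):
        if not 0 <= v < 8:
            raise ValueError("each entry must fit in 3 bits")
        for k in range(3):
            bit = (v >> (2 - k)) & 1
            planes[k] |= bit << i
    return planes


def lookup(planes, nlz):
    """Rebuild the 3-bit entry for index nlz from the three constants.

    The three shift-and-mask terms are independent of each other, so
    an out-of-order core can compute them in parallel; only the final
    OR combines them.
    """
    hi = ((planes[0] >> nlz) & 1) << 2
    mid = ((planes[1] >> nlz) & 1) << 1
    lo = (planes[2] >> nlz) & 1
    return hi | mid | lo


# Hypothetical 32-entry table, one entry per possible nlz of a 32-bit word.
VALUES = [(i * 5) % 8 for i in range(32)]
PLANES = pack_planes(VALUES)
```

In this layout `lookup(PLANES, n)` gives back `VALUES[n]` for every `n` from 0 to 31 without touching memory. Whether that beats a small array lookup in practice depends on the target CPU and would need measuring.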

It's not faster; the point is that the speedup from "binary division" is probably not worth the day or so spent developing it, except in extreme cases.

1 comment

Ah, right, I forgot the &1 masks. And I guess that with three literals and OOOE it might be faster, yes.

But yes, extremely unlikely to be a hot path in most situations.