by fear91 1034 days ago
New Intel/AMD CPUs do a register-based popcount in a single clock.

Used to be three cycles.

Unfortunately, the original AMD64 back in 2003 lacked a popcount instruction, so most software built for PCs even today contains no instances of it. The means of getting the instruction generated are finicky, non-portable, and often result unexpectedly in a function call instead. E.g., without a "-m" option, GCC and Clang will turn "__builtin_popcount()" into a function call. Likewise "std::popcount()" and "std::bitset<>::count()". Always use at least "-mavx".