by rob74 2 days ago
Ok, then it will be an explosion of binary size, if you have several code blocks optimized for each architecture level - I'm not very familiar with the subject, but I imagine it would have to be relatively large chunks of code, otherwise the constant branching would eat up the speed advantage.

These are usually pretty tight loops or constructs based on specific features.

An unspecialised popcnt is half a dozen instructions; for the specialised versions it’s 4 implementations ranging from half a dozen to two dozen bytes.
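
To make the size argument concrete, here is a rough sketch of what such a specialised construct can look like in C. It assumes GCC or Clang (for `__builtin_cpu_supports`, `__builtin_popcountll` and the `target` attribute), and the names `popcount_generic`, `popcount_hw`, `select_popcount` and `popcount64` are made up for illustration, not taken from any particular library. Only two variants are shown rather than four.

```c
#include <assert.h>
#include <stdint.h>

/* Portable fallback: classic SWAR bit count, needs no special CPU feature. */
static unsigned popcount_generic(uint64_t x) {
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    return (unsigned)((x * 0x0101010101010101ULL) >> 56);
}

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
/* Specialised variant: POPCNT is enabled for this one function only,
 * so the rest of the binary still targets the baseline architecture. */
__attribute__((target("popcnt")))
static unsigned popcount_hw(uint64_t x) {
    return (unsigned)__builtin_popcountll(x);
}
#define HAVE_POPCNT_VARIANT 1
#endif

typedef unsigned (*popcount_fn)(uint64_t);

/* Query the CPU and return the best variant available at run time. */
static popcount_fn select_popcount(void) {
#ifdef HAVE_POPCNT_VARIANT
    __builtin_cpu_init();
    if (__builtin_cpu_supports("popcnt"))
        return popcount_hw;
#endif
    return popcount_generic;
}

/* Public entry point (hypothetical name). The choice is cached after the
 * first call, so later calls cost a null check plus an indirect call.
 * The lazy init is not synchronised; this is a sketch only. */
static unsigned popcount64(uint64_t x) {
    static popcount_fn impl;
    if (!impl)
        impl = select_popcount();
    assert(impl); /* select_popcount always returns one of the variants */
    return impl(x);
}
```

Calling `popcount64(0xFF)` gives 8 whichever variant gets picked. The duplicated code is limited to the small leaf functions, and the feature check runs once instead of at every call site, which is why the branching overhead stays small.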