It's most definitely not free. It'd consume fetch bandwidth, decode/rename/scheduler slots, an execution port, etc.

The comparison here is:

((v ^ 0x303030) * 0x640a0100) >> (len << 3)

against:

table[(((v >> 12) | v) & 0xfff) | (len << 12)]

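The former is 4 ops (xor, multiply, two shifts), the latter is 6 ops (two shifts, two ORs, an AND and the load), so throughput-wise, the former wins. Latency-wise, it also wins: an L1 cache hit is generally 3-5 cycles, whilst an integer multiply is typically 3-4, and the table lookup can't even start until its index computation has finished.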
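
For anyone who wants to poke at the multiply version, here's a minimal sketch in C. It assumes a few things that aren't spelled out above: that v holds 1 to 3 ASCII digits packed little-endian (first character in the low byte), that len is the digit count, that the multiply is 32-bit, and that the result is taken from the low byte after the shift. The names parse_digits_mul and pack_le are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: pack up to 4 chars little-endian into a uint32_t,
 * i.e. s[0] ends up in the low byte. Avoids depending on host endianness. */
static uint32_t pack_le(const char *s, unsigned n)
{
    uint32_t v = 0;
    for (unsigned i = 0; i < n && i < 4; i++)
        v |= (uint32_t)(unsigned char)s[i] << (8 * i);
    return v;
}

/* Hypothetical wrapper around the expression from the comment.
 * Assumes len is 1..3 and the first len bytes of v are ASCII digits. */
static uint8_t parse_digits_mul(uint32_t v, unsigned len)
{
    /* '0'..'9' (0x30..0x39) become 0..9 in each of the low three bytes. */
    uint32_t digits = v ^ 0x303030u;

    /* Multiplier bytes are 100, 10, 1, 0 from high to low, so after the
     * (wrapping, 32-bit) multiply:
     *   byte 1 = d0
     *   byte 2 = 10*d0 + d1
     *   byte 3 = 100*d0 + 10*d1 + d2   (mod 256) */
    uint32_t prod = digits * 0x640a0100u;

    /* Shift the byte for this length down and keep only the low 8 bits.
     * The truncation to uint8_t is an assumption of this sketch. */
    return (uint8_t)(prod >> (len << 3));
}
```

Because the low byte of the multiplier is zero, whatever sits in v past the first len bytes only lands in bytes above the one being extracted, so under these assumptions parse_digits_mul(pack_le("42.", 3), 2) still yields 42. Three-digit inputs above 255 wrap modulo 256, so range checking would have to happen elsewhere.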