Hacker News new | ask | show | jobs
by brucehoult 1022 days ago
LMUL (and especially fractional LMUL) isn't for performance, it's for kernels with mixed-element sizes, to maximise the number of variables (and elements) you can keep in registers without spilling.

Being able to use LMUL as a way to get the effect of unrolling and hide the pointer bumps and loop control on simple loops on narrow processors, without expanding the code, is just a bonus.