|
|
|
|
|
by brucehoult
1022 days ago
|
|
LMUL (and especially fractional LMUL) isn't for performance, it's for kernels with mixed-element sizes, to maximise the number of variables (and elements) you can keep in registers without spilling. Being able to use LMUL as a way to get the effect of unrolling and hide the pointer bumps and loop control on simple loops on narrow processors, without expanding the code, is just a bonus. |
|