by mgaunard 714 days ago
To me, Duff's device is just a mechanism to unroll a loop without having to duplicate the code for the trailing case.
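
For anyone who hasn't seen it written out, here is a rough sketch of the idea in C. The function name `duff_copy` and the unroll factor of 4 are my own choices for illustration, not anything canonical. The `switch` jumps into the middle of the unrolled `do`/`while` body to handle the `count % 4` leftover elements first, so no separate remainder loop is needed.

```c
#include <stddef.h>

/* Hypothetical helper: copy `count` ints from `src` to `dst`,
 * with the loop body unrolled 4x in the style of Duff's device. */
void duff_copy(int *dst, const int *src, size_t count)
{
    if (count == 0)
        return;                       /* the do/while below always runs at least once */

    size_t passes = (count + 3) / 4;  /* trips through the do/while, rounded up */

    switch (count % 4) {              /* jump into the middle of the unrolled body */
    case 0: do { *dst++ = *src++;
    case 3:      *dst++ = *src++;
    case 2:      *dst++ = *src++;
    case 1:      *dst++ = *src++;
            } while (--passes > 0);
    }
}
```

With `count = 7`, the switch enters at `case 3`, copies 3 elements, then the loop goes around once more and copies the remaining 4. The unrolled body is the only copy of the loop code, which is the whole point.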

Even when you can't use SIMD, the unrolled body still lets you benefit from instruction-level parallelism.

It's potentially better in scenarios where you want to minimize instruction cache usage and the loop runs for only a few iterations.