by RaisingSpear 545 days ago
For CPUs that support AVX-512 VBMI, there's a faster reciprocal-based approach: https://avereniect.github.io/2023/04/29/uint8_division_using...

A VBMI2 example implementation can be found in the function named divide_with_lookup_16b: https://godbolt.org/z/xdE1dx5Pj
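
For anyone who just wants the gist of the reciprocal trick without reading intrinsics, here is a minimal scalar sketch in C. It is not the code from the linked article or the godbolt link; the names `recip_table`, `init_recip_table`, and `div_u8_recip` are made up for illustration, and the 16-bit fixed-point scale is my assumption. The idea is to precompute ceil(2^16 / d) for each possible divisor, then replace each division with a table lookup, a multiply, and a shift. A vectorized version would presumably do the lookup with byte permutes across many lanes at once.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical lookup table: recip_table[d] = ceil(2^16 / d) for d in 1..255. */
static uint32_t recip_table[256];

/* Fill the table once before calling div_u8_recip. */
static void init_recip_table(void) {
    recip_table[0] = 0; /* unused: division by zero is rejected below */
    for (uint32_t d = 1; d < 256; ++d) {
        recip_table[d] = (65536u + d - 1u) / d; /* ceiling of 65536 / d */
    }
}

/* Unsigned 8-bit division via multiply by a fixed-point reciprocal.
 * The rounding error added by the ceiling is below 1/256, which is
 * smaller than the 1/d gap to the next integer, so the result is
 * exact for every x in 0..255 and d in 1..255. */
static uint8_t div_u8_recip(uint8_t x, uint8_t d) {
    assert(d != 0);
    return (uint8_t)(((uint32_t)x * recip_table[d]) >> 16);
}
```

Call `init_recip_table()` once, after which `div_u8_recip(x, d)` should agree with `x / d`. The 8-bit input range is small enough to check every pair exhaustively.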