I noticed how fast it is while I was experimenting with algorithms for unbiased bounded random numbers (https://dotat.at/@/2022-04-20-really-divisionless.html): the fancy nearly-divisionless and really-divisionless algorithms had much less of an advantage on an Apple M1 than on an old Intel CPU.
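
For anyone who hasn't seen these algorithms, here is a minimal sketch of the nearly divisionless approach (usually credited to Daniel Lemire), as I understand it. It is not the code from the linked post: `rng32` is a placeholder xorshift generator and `bounded_random` is a name I made up for illustration. The point is that the only `%` sits on a rarely taken slow path, so the trick pays off less when the CPU's divider is already fast.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder generator (Marsaglia xorshift32), used only so the sketch
 * is self-contained. Substitute whatever 32-bit RNG you actually use. */
static uint32_t rng_state = 2463534242u; /* any nonzero seed */

static uint32_t rng32(void) {
    uint32_t x = rng_state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    rng_state = x;
    return x;
}

/* Unbiased value in [0, range), nearly divisionless.
 * Hypothetical name; range must be nonzero. */
static uint32_t bounded_random(uint32_t range) {
    assert(range != 0);
    /* Multiply a 32-bit random value by range: the high 32 bits of the
     * product are the candidate result, the low 32 bits decide whether
     * the candidate must be rejected to avoid bias. */
    uint64_t m = (uint64_t)rng32() * range;
    uint32_t low = (uint32_t)m;
    if (low < range) {
        /* Slow path, taken with probability range / 2^32.
         * threshold == 2^32 mod range; this is the only division. */
        uint32_t threshold = (uint32_t)(-range) % range;
        while (low < threshold) {
            m = (uint64_t)rng32() * range;
            low = (uint32_t)m;
        }
    }
    return (uint32_t)(m >> 32);
}
```

Calling `bounded_random(6)` gives a value from 0 to 5. For small ranges the `%` is almost never executed, which is the whole advantage over the classic approach of computing a remainder on every call.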