by anematode 268 days ago
I don't think there's a single instruction that does this, but you could probably get there with a combination of shld + bzhi + cmov. rustc already seems to do a great job, and the best I could come up with, assuming [src, src + len] is always in bounds, isn't much better.

Edit: https://godbolt.org/z/rrhW6T7Mc
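
For readers who want a rough picture of what a shift + mask sequence like that might compute, here is a minimal Rust sketch. It is not the code behind the link above. The question being answered isn't quoted in this thread, so the sketch assumes the operation is "read `len` bits starting at bit offset `off` from two adjacent 64-bit words". The function name `extract_bits` and its parameters are hypothetical, chosen only for illustration.

```rust
/// Hypothetical helper (not from the linked code): read `len` bits (0..=64)
/// starting at bit offset `off` (0..=63) from the 128-bit value whose low
/// word is `lo` and whose high word is `hi`.
fn extract_bits(lo: u64, hi: u64, off: u32, len: u32) -> u64 {
    debug_assert!(off < 64 && len <= 64);

    // Double-word ("funnel") shift: join the two words and shift the field
    // down to bit 0. On x86-64 this is the kind of work shld/shrd do.
    let joined = ((hi as u128) << 64) | (lo as u128);
    let shifted = (joined >> off) as u64;

    // Keep only the low `len` bits, which is what bzhi does. Portable code
    // needs the len == 64 special case because `1u64 << 64` overflows; a
    // compiler may lower a select like this to a conditional move.
    let mask = if len >= 64 { u64::MAX } else { (1u64 << len) - 1 };

    shifted & mask
}
```

For example, `extract_bits(0xABCD, 0, 4, 8)` gives `0xBC`. Whether the compiler actually emits shld, bzhi, or cmov for this depends on the target features and optimization level, so it is worth checking the asm rather than assuming.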

1 comments

Thank you very much. Very interesting! I'll see how my implementation compares in asm.
Cool :) Feel free to reach out as well. I should note that the code at that link is optimized for variable lengths and offsets. If your lengths and offsets are constant, it can be much more efficient, and I'd expect rustc/LLVM to nail it.