My coworker (junior at the time) wrote the 64-bit implementations by reading the original papers. You should be able to find other implementations on GitHub by now, since the paper has been out for a while. Our code is pretty tied to our own software (our allocation / buffer handling).
In hindsight, it might not have been the best idea to say, "Hey, you just started working with C++ and have never worked with SIMD, so yeah... I need you to extend this SIMD Fast PFOR scheme to 64-bit." But he ended up doing quite well.