Sorry for the confusing example. The bitwise one is correct: I store the binary data in u64 with a different endianness from the C++ version (the C++ version uses a numpy script for its preprocessing, which is where the difference comes from). My bad, I should have explained it better. I will update the post.
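
A minimal sketch of why a different ordering changes the stored u64 values without changing a bitwise result. It assumes the difference is in how bits are ordered inside each 64-bit word, and it uses Hamming distance as a stand-in for the bitwise operation; `pack_bits`, `hamming`, and the sample vectors are hypothetical names and data for illustration, not the actual code from either version.

```python
def pack_bits(bits, msb_first):
    """Pack up to 64 bits (0/1 values) into one unsigned 64-bit integer."""
    assert len(bits) <= 64
    word = 0
    for i, bit in enumerate(bits):
        # MSB-first puts element 0 at bit 63; LSB-first puts it at bit 0.
        shift = (63 - i) if msb_first else i
        word |= (bit & 1) << shift
    return word


def hamming(a, b):
    """Count the bit positions in which two packed words differ."""
    return bin(a ^ b).count("1")


# Hypothetical sample vectors; they differ at indices 1, 2 and 6.
x = [1, 0, 1, 1, 0, 0, 0, 1]
y = [1, 1, 0, 1, 0, 0, 1, 1]

x_msb, y_msb = pack_bits(x, msb_first=True), pack_bits(y, msb_first=True)
x_lsb, y_lsb = pack_bits(x, msb_first=False), pack_bits(y, msb_first=False)

# The raw integers differ between the two layouts...
print(hex(x_msb), hex(x_lsb))
# ...but the bitwise distance is the same, as long as both operands
# were packed with the same layout.
print(hamming(x_msb, y_msb), hamming(x_lsb, y_lsb))
```

So printing the raw words from the two versions side by side will look inconsistent, while a comparison done entirely within one layout still gives the same answer.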