Hacker News
by m1el 3295 days ago
But using the same set of instructions, z-order encoding and decoding are 8 instructions each (5 if you exclude size conversion and return):

    zorder64_inv:
        movabsq $0x5555555555555555, %rax
        pextq   %rax, %rcx, %rdx
        shrq    %rcx
        pextq   %rax, %rcx, %rcx
        shlq    $32, %rcx
        movl    %edx, %eax
        orq     %rcx, %rax
        retq

    zorder64:
        movl    %ecx, %eax
        movabsq $0x5555555555555555, %r8
        pdepq   %r8, %rax, %rcx
        movl    %edx, %eax
        pdepq   %r8, %rax, %rax
        addq    %rax, %rax
        orq     %rcx, %rax
        retq
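
For readers without the assembly at hand, here is a rough C sketch of what the two routines above compute. It assumes `x` lands in the even bit positions and `y` in the odd ones, as in the listing. The helpers `pext64` and `pdep64` are hypothetical bit-by-bit stand-ins for the hardware instructions, written for portability rather than speed; on a BMI2-capable CPU the `_pext_u64` and `_pdep_u64` intrinsics from `<immintrin.h>` would presumably take their place.

```c
#include <stdint.h>

/* Mask selecting the even bit positions, as in the assembly above. */
#define EVEN_BITS 0x5555555555555555ULL

/* Software stand-in for PEXT (hypothetical helper): collect the bits of
   src that sit under set bits of mask and pack them into the low bits. */
static uint64_t pext64(uint64_t src, uint64_t mask) {
    uint64_t out = 0;
    for (uint64_t bit = 1; mask != 0; bit <<= 1) {
        uint64_t lowest = mask & (~mask + 1); /* lowest set bit of mask */
        if (src & lowest) out |= bit;
        mask &= mask - 1;                     /* clear that bit */
    }
    return out;
}

/* Software stand-in for PDEP (hypothetical helper): scatter the low bits
   of src to the positions of the set bits of mask. */
static uint64_t pdep64(uint64_t src, uint64_t mask) {
    uint64_t out = 0;
    for (uint64_t bit = 1; mask != 0; bit <<= 1) {
        uint64_t lowest = mask & (~mask + 1);
        if (src & bit) out |= lowest;
        mask &= mask - 1;
    }
    return out;
}

/* Encode: x goes to the even bits, y to the odd bits. */
uint64_t zorder64(uint32_t x, uint32_t y) {
    return pdep64(x, EVEN_BITS) | (pdep64(y, EVEN_BITS) << 1);
}

/* Decode: returns x in the low 32 bits and y in the high 32 bits,
   mirroring the packed return value of zorder64_inv above. */
uint64_t zorder64_inv(uint64_t z) {
    uint64_t x = pext64(z, EVEN_BITS);
    uint64_t y = pext64(z >> 1, EVEN_BITS);
    return (y << 32) | x;
}
```

For example, `zorder64(2, 1)` places bit 1 of `x` at position 2 and bit 0 of `y` at position 1, giving 6, and `zorder64_inv(6)` returns `((uint64_t)1 << 32) | 2`.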
1 comment

Nice! Now I wonder when 36 vs. 8 machine instructions becomes a bottleneck. I have seen applications of space-filling curves in quasi-Monte Carlo integration; it could potentially be significant there.