Hacker News new | ask | show | jobs
by secondcoming 1353 days ago
Was about to post the same sentiment:

C++ compare function:

    compare(double, double):     # @compare(double, double)
        ucomisd xmm1, xmm0
        seta    al
        ret

C compare function:

    compare:           # @compare
        movsd   xmm0, qword ptr [rdi]           # xmm0 = mem[0],zero
        movsd   xmm1, qword ptr [rsi]           # xmm1 = mem[0],zero
        movapd  xmm2, xmm0
        cmpneqsd        xmm2, xmm1
        movq    rcx, xmm2
        and     ecx, 1
        ucomisd xmm1, xmm0
        mov     eax, -1
        cmovbe  eax, ecx
        ret
1 comments

Nice data point. This is also relevant beyond qsort vs std::sort. Some implementations of Quicksort use a 3-way partition to save work for skewed inputs (if the pivot occurs often, no need to recurse for all those values).

We have found that in the SIMD context, those two comparisons are expensive enough to make this no longer worthwhile. We instead special-case subarrays with one and two unique values, which can be about 4x as fast on real data vs uniform-random data.