Hacker News new | ask | show | jobs
by AshleysBrain 1032 days ago
Most languages have a `max` function, so the core of the loop could be written with just something like: `maxV = max(maxV, v)`

That could be entirely branchless, right?

1 comments

A max function still compiles down to some kind of branch
I thought there were specific assembly instructions for this kind of thing, such as MAXSS in x86 [1], plus vector variants like SSE4 PMAXSD. Presumably it's possible the CPU can handle those with special branchless logic, depending on the compiler and CPU implementation. I guess you'd have to know about the CPU internals to know if the instruction is truly branchless, but it is branchless in the sense there is no conditional jump made in the assembly instructions.

[1] https://stackoverflow.com/questions/40196817/what-is-the-ins...

clang compiles all three of these functions to use max instructions (https://godbolt.org/z/9z7hGfdhq).

    #include <algorithm>
    
    using std::max;
    
    int max_array_func(int values[], size_t values_count)
    {
        int max_value = values[0];
        for (size_t j = 0; j < values_count; j++)
        {
            max_value = max(max_value, values[j]);
        }
        return max_value;
    }
    
    int max_array_bittwiddling(int values[], size_t values_count)
    {
        int max_value = values[0];
        for (size_t j = 0; j < values_count; j++)
        {
            int x = max_value;
            int y = values[j];
            // http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax
            max_value = x ^ ((x ^ y) & -(x < y));
        }
        return max_value;
    }
    
    int max_array_branch(int values[], size_t values_count)
    {
        int max_value = values[0];
        for (size_t j = 0; j < values_count; j++)
        {
            if (max_value > values[j])
            {
                max_value = values[j];
            }
        }
        return max_value;
    }