I thought there were dedicated instructions for this kind of thing, such as MAXSS on x86 [1], plus vector variants like SSE4.1's PMAXSD. Whether you actually get them depends on the compiler and the target, and I guess you'd have to know the CPU internals to say whether the instruction is truly branchless in hardware. But it is branchless in the sense that there is no conditional jump in the assembly.
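To make that concrete, here is a rough sketch in C. The function names `max_plain` and `max_sse` are made up for illustration. The first is an ordinary ternary; when optimizing for x86-64, GCC and Clang will commonly turn it into a single MAXSS, but that isn't guaranteed, so check the generated assembly for your own compiler and flags. The second requests the instruction explicitly through the SSE intrinsic `_mm_max_ss`, which only builds on x86 targets.

```c
#include <xmmintrin.h>  /* SSE intrinsics; x86 / x86-64 only */

/* Plain C max. The "a > b ? a : b" form matches MAXSS semantics
   (second operand is returned when the comparison is false, including
   for NaN), so an optimizing compiler may emit MAXSS here instead of a
   compare-and-jump. That is an assumption about the compiler, not a
   guarantee. */
float max_plain(float a, float b) {
    return a > b ? a : b;
}

/* Explicit version: _mm_max_ss maps to the MAXSS instruction.
   _mm_set_ss loads a scalar into the low lane of a vector register,
   and _mm_cvtss_f32 extracts the low lane back out. */
float max_sse(float a, float b) {
    return _mm_cvtss_f32(_mm_max_ss(_mm_set_ss(a), _mm_set_ss(b)));
}
```

Compiling with something like `gcc -O2 -S` and reading the output is the easiest way to see whether a conditional jump shows up for `max_plain`.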