by Findecanor 1590 days ago
As an asm geek, I wasn't surprised to read that taking advantage of the carry flag yielded the most efficient code for some processors. I recalled that some ISAs also have special SIMD instructions specifically for unsigned average, so I looked them up:

* x86 SSE/AVX/AVX2 have (V)PAVGB and (V)PAVGW, for 8-bit and 16-bit unsigned integers. These are "rounding" instructions, though: they add 1 to the sum before the shift.

* ARM "Neon" has signed and unsigned "Halving Addition" for 8-, 16- or 32-bit integers, rounding or truncating.

* RISC-V's new Vector Extension has instructions for both signed and unsigned "Averaging Addition". Rounding mode and integer size are modal.

* The on-the-way-out MIPS MSA set has instructions for signed, unsigned, rounded and truncated average, for all integer widths.
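
To make the rounding/truncating distinction above concrete, here is a rough scalar sketch in C of what one 8-bit unsigned lane computes in each mode. The function names are made up for illustration and are not intrinsics from any of these ISAs; widening to 16 bits stands in for the extra carry bit the hardware keeps internally.

```c
#include <stdint.h>

/* Hypothetical helper: truncating unsigned average of one 8-bit lane,
 * floor((a + b) / 2). The sum is formed in 16 bits so it cannot overflow. */
static uint8_t avg_trunc_u8(uint8_t a, uint8_t b) {
    uint16_t sum = (uint16_t)a + (uint16_t)b;
    return (uint8_t)(sum >> 1);
}

/* Hypothetical helper: rounding unsigned average of one 8-bit lane,
 * floor((a + b + 1) / 2), i.e. 1 is added to the sum before the shift. */
static uint8_t avg_round_u8(uint8_t a, uint8_t b) {
    uint16_t sum = (uint16_t)a + (uint16_t)b + 1u;
    return (uint8_t)(sum >> 1);
}
```

The two only differ when a + b is odd: for 1 and 2 the truncating form gives 1 and the rounding form gives 2.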

Some ISAs also have "halving subtraction", but the purpose is not as obvious.