by leni536 1337 days ago
llvm-mca prefers the "fold" over the "min". Intuitively, it's more pipelinable: the codegen for "min" has data dependencies that the "fold" version does not.

https://godbolt.org/z/1zjKP6z8q

edit:

A mixture of the two approaches possibly outperforms both:

https://godbolt.org/z/WxGWPvYTs

edit2:

Doh, of course, to be correct for the min approach you need to use unsigned. It does not change the performance, though.