by rostayob 702 days ago
gpderetta is right -- test/cmp followed by a conditional jump will get macro-fused into a single uop.

uiCA is a very nice tool that simulates how instructions will get scheduled. For example, this is the trace it produces for sum3 on Haswell, showing the fusion: https://uica.uops.info/tmp/75182318511042c98d4d74bc026db179_... .


It's cool. I would love to have this for ARMv8 Macs.
The LLVM project has a tool called llvm-mca that does this. Example: https://gcc.godbolt.org/z/7zcova1ce

The version in the Compiler Explorer wouldn't work on AArch64 without an -mcpu flag and I didn't know what to pass, so I copied -mcpu=cyclone from https://djolertrk.github.io/2021/11/05/optimize-AARCH64-back.... You'd have to look up the correct one for your Mac's CPU.
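
If you want to try this locally rather than in the Compiler Explorer, here is a rough sketch of a helper that builds the llvm-mca command line. It only assembles the arguments and does not run the tool. The helper name mca_command and the file name sum3.s are made up for illustration, and cyclone is just the value copied above, not necessarily the right one for your CPU. The flags themselves (-mtriple, -mcpu, -timeline) are ones llvm-mca accepts.

```python
import shlex


def mca_command(mcpu, mtriple="aarch64", timeline=True, asm_path=None):
    """Build an llvm-mca argument list (hypothetical helper, not part of LLVM).

    mcpu     -- CPU model name passed via -mcpu (e.g. "cyclone"; an assumption,
                look up the right one for your machine)
    mtriple  -- target triple passed via -mtriple
    timeline -- if True, request the per-instruction timeline view
    asm_path -- assembly file to analyze; None means llvm-mca reads stdin
    """
    argv = ["llvm-mca", f"-mtriple={mtriple}", f"-mcpu={mcpu}"]
    if timeline:
        argv.append("-timeline")
    if asm_path is not None:
        argv.append(asm_path)
    return argv


if __name__ == "__main__":
    # "sum3.s" is a placeholder for assembly you generated yourself,
    # e.g. with your compiler's -S option.
    print(shlex.join(mca_command("cyclone", asm_path="sum3.s")))
```

You can paste the printed command into a shell, or hand the list to subprocess.run if llvm-mca is on your PATH.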