Hacker News
by phkahler 381 days ago
Intel is also supposed to introduce the new APX instructions, which include a bunch of instructions that duplicate existing ones but don't set any flags. The only plausible reason to add these is performance.

This isn't just due to the actual dependencies of flag instructions at the hardware level (although that is likely a factor); it also majorly affects code layout. On Arm64, for example, you can make a comparison, do other operations, and then consume the result of that comparison afterwards, which is excellent for the pipeline and OoO engine. However, because most instructions on x86_64 write flags, you can't do this, so you are forced to cram `jcc`/`setcc` instructions right after the comparison, which is less friendly to compilers and the OoO engine.
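To make the layout point concrete, here is a rough toy model in Python of which instruction's flags a consumer actually sees. The mnemonic sets and the `flag_source` helper are simplified assumptions for illustration, not a full description of either ISA.

```python
# Toy model: which instruction last wrote the flags before a consumer reads them?
# The sets below are simplified assumptions, not a complete ISA description.

X86_FLAG_WRITERS = {"cmp", "add", "sub", "and"}       # legacy x86-64 ALU ops write flags
ARM64_FLAG_WRITERS = {"cmp", "adds", "subs", "ands"}  # plain add/sub/and leave NZCV alone
FLAG_READERS = {"jcc", "setcc", "b.cond", "cset"}


def flag_source(program, flag_writers):
    """Map each flag reader's index to the index of the last flag writer before it."""
    last_writer = None
    seen = {}
    for i, op in enumerate(program):
        if op in FLAG_READERS:
            seen[i] = last_writer
        if op in flag_writers:
            last_writer = i
    return seen


# Compare, do unrelated arithmetic, then branch on the compare.
arm64 = ["cmp", "add", "sub", "b.cond"]
x86 = ["cmp", "add", "sub", "jcc"]

print(flag_source(arm64, ARM64_FLAG_WRITERS))  # {3: 0} -> branch sees the cmp
print(flag_source(x86, X86_FLAG_WRITERS))      # {3: 2} -> branch sees the sub, not the cmp
```

In this model the x86 branch no longer tests the comparison once flag-writing arithmetic sits in between, which is why the `jcc` ends up right next to the `cmp`. Instructions that don't write flags would presumably behave like the Arm64 row.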
OoO should actually be the case where that doesn't matter, I'd think: the CPU can, well, execute the instructions in a different order than they appear in the binary. It's in-order implementations where that matters more.

And with compare & jump being adjacent they can be fused together into one uop, which Intel, AMD, and Apple Silicon all do.
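A very rough way to picture the fusion benefit, as a Python sketch: count an adjacent `cmp` + `jcc` pair as one uop and everything else as one uop each. The `count_uops` helper is hypothetical, and real fusion rules have more conditions than this.

```python
# Toy uop counter: an adjacent compare + conditional jump counts as one fused uop.
# Real hardware fusion rules are more restrictive; this is a deliberate simplification.

def count_uops(program):
    uops = 0
    i = 0
    while i < len(program):
        if program[i] == "cmp" and i + 1 < len(program) and program[i + 1] == "jcc":
            i += 2  # the pair is issued as a single fused uop
        else:
            i += 1
        uops += 1
    return uops


adjacent = ["mov", "cmp", "jcc"]   # compare and jump next to each other
separated = ["cmp", "mov", "jcc"]  # a flag-preserving mov in between

print(count_uops(adjacent))   # 2
print(count_uops(separated))  # 3
```

So even where the ISA lets you separate the compare from the branch, keeping them adjacent can still pay off in the front end.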