by jcranmer 2265 days ago
Micro-ops are the actual things that can be executed by the hardware. A floating-point FMA unit is going to support floating-point addition, subtraction, fused multiply-add (with various intermediate sign twiddles), and integer multiplication and wide multiplication--all without adding much more hardware: you're adding a few xors or muxes to the big, fat multiplier in the middle of it all. Each of these might have a distinct micro-op, or you might be able to separate the processing stages and use a single multiplier micro-op with distinct preprocessing micro-ops for the different instructions. Realistically, though, you are adding new micro-ops, although the overall hardware burden may be light.
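
To make the "few xors or muxes around one big multiplier" idea concrete, here's a rough software sketch. It is a toy model, not how any real core is built: the names `fma_core`, `MICRO_OPS` and `execute` are made up for illustration, only the floating-point operations are modeled, it handles finite inputs only, and it ignores signed zeros. The fused single rounding is emulated with exact rational arithmetic.

```python
from fractions import Fraction


def fma_core(a, b, c):
    """Shared datapath: a*b + c computed exactly, then rounded once.

    Assumption: finite inputs only (no NaN/infinity, no signed-zero handling).
    """
    return float(Fraction(a) * Fraction(b) + Fraction(c))


# Hypothetical micro-op table. Each entry only routes operands and flips
# signs (the "xors and muxes") before handing them to the one shared core.
MICRO_OPS = {
    "add":    lambda a, b, c: fma_core(a, 1.0, b),   # a*1 + b
    "sub":    lambda a, b, c: fma_core(a, 1.0, -b),  # a*1 + (-b)
    "mul":    lambda a, b, c: fma_core(a, b, 0.0),   # a*b + 0
    "fmadd":  lambda a, b, c: fma_core(a, b, c),     # a*b + c
    "fmsub":  lambda a, b, c: fma_core(a, b, -c),    # a*b - c
    "fnmadd": lambda a, b, c: fma_core(-a, b, c),    # -(a*b) + c
}


def execute(op, a, b, c=0.0):
    """Dispatch one micro-op to the shared core."""
    return MICRO_OPS[op](a, b, c)
```

Every entry in the table costs only a little operand steering; the expensive part, `fma_core`, exists once. That's the sense in which the new micro-ops are cheap in hardware terms.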

The motivation for adding new instructions is generally to get higher performance, so there's going to be pressure to have hardware that executes them well, as opposed to a more naive emulation. But sometimes people add support without making it fast--AMD chips used to (still do? I'm not sure) implement the 256-bit AVX instructions by sending the 128-bit halves through their units in sequence, so that they technically supported AVX instructions but didn't get any speedup from them.
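
A toy illustration of the "two 128-bit halves in sequence" approach, assuming 32-bit float lanes. The function names and the pass counter are invented for the example, and a "pass" here is just a count of trips through the execution unit, not a real cycle count for any particular chip.

```python
LANES_128 = 4  # four 32-bit floats fit in 128 bits
LANES_256 = 8  # eight 32-bit floats fit in 256 bits


def add_128(x, y):
    """One trip through a hypothetical 128-bit-wide adder."""
    assert len(x) == len(y) == LANES_128
    return [a + b for a, b in zip(x, y)]


def add_256_native(x, y):
    """256-bit add on a full-width unit: one pass."""
    assert len(x) == len(y) == LANES_256
    return [a + b for a, b in zip(x, y)], 1


def add_256_split(x, y):
    """256-bit add on a 128-bit unit: low half, then high half."""
    assert len(x) == len(y) == LANES_256
    lo = add_128(x[:LANES_128], y[:LANES_128])
    hi = add_128(x[LANES_128:], y[LANES_128:])
    return lo + hi, 2  # same result, but two passes
```

Both versions return the same eight lanes; the split one just takes two passes to do it, which is why the instruction can be "supported" without being any faster than issuing two 128-bit adds yourself.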