Hacker News
by wtallis 534 days ago
I believe the throughput shown in those tables is the total throughput for the whole CPU core, so it isn't immediately obvious which instructions have high throughput because of pipelining within an execution unit and which have it simply because the core has several execution units capable of handling that instruction.

That's true, but another part of the tables shows how many "ports" the operation can be executed on, which, combined with the latency and throughput figures, is enough information to conclude whether an operation is pipelined.

For example, for many years Intel chips had a multiplier unit on a single port, with a latency of 3 cycles, but an inverse throughput of 1 cycle, so effectively pipelined across 3 stages.
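
To make that reasoning concrete, here is a rough Python sketch of the check. The function names are made up for illustration, and it assumes the measured throughput is limited by the execution units (not the front end) and that work is spread evenly across the listed ports. The first set of numbers is the multiplier case above; the second is a hypothetical 1-cycle op on 4 ports.

```python
from fractions import Fraction


def ops_in_flight_per_port(latency, inv_throughput, ports):
    """Average number of ops each port must have in flight to sustain
    the measured per-core inverse throughput (hypothetical helper)."""
    # The core starts 1/inv_throughput ops per cycle in total, so each
    # port starts 1/(inv_throughput * ports) ops per cycle, and each op
    # occupies the pipeline for `latency` cycles.
    return Fraction(latency) / (Fraction(inv_throughput) * ports)


def is_pipelined(latency, inv_throughput, ports):
    """True if the numbers can only be explained by a port accepting a
    new op before the previous one has finished."""
    return ops_in_flight_per_port(latency, inv_throughput, ports) > 1


# Multiplier from the comment: 1 port, latency 3, inverse throughput 1.
print(is_pipelined(3, 1, 1))      # True: 3 ops in flight on one port

# Hypothetical op: 4 ports, latency 1, inverse throughput 0.25.
print(is_pipelined(1, 0.25, 4))   # False: the port count alone explains it
```

In the second case the high throughput comes purely from having several execution units, which is exactly the ambiguity raised in the parent comment; the port count is what lets you tell the two cases apart.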

In any case, I think uops.info [1] has replaced Agner for up-to-date and detailed information on instruction execution.

---

[1] https://uops.info/table.html

Shame it doesn't seem to have been updated with Arrow Lake, Zen 5 and so on yet.

Yes. In the past, new hardware has been made available to the uops.info authors so they could run their benchmark suite and publish new numbers. I'm not sure whether that just hasn't happened for the new stuff, or whether they are not interested in updating it.