Hacker News
by 323 1416 days ago
This raises the question: how many 32-bit crc32 pipelines do modern AMD CPUs have?

https://uops.info/html-instr/CRC32_R64_R64.html answers that for you. Zen2 and Zen3 are the same as Intel: latency 3 cycles, throughput 1 per cycle. Older AMD chips are worse.
Also, AMD's AVX2 machines are faster than Intel's, but the current benchmarks probably didn't have that hardware on hand, and Google's Highway numbers are also Intel-based. I wanted to test this as well and got a machine, but I'm having the laziest days :)
But the question was how many execution ports it has, not what's the latency/throughput of one port.
Given latency 3 / throughput 1, the only reasonable implementations are:
A) Three ports, each non-pipelined, taking 3 cycles
B) One port, with three-cycle pipeline (each cycle, one instruction can enter the start of the pipeline, and anything in-progress moves forward one stage)
Given that CRC isn't too hard to pipeline, and (B) requires less physical hardware, it is almost certainly (B).
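To see why the latency and throughput figures alone can't separate (A) from (B), here is a toy scheduling model. It is only a sketch under simplifying assumptions: the function name `simulate` and the design labels are made up for illustration, the front end is assumed to be unlimited, and nothing here describes real AMD or Intel hardware.

```python
LATENCY = 3  # cycles from issue to result, as quoted in the thread


def simulate(num_instrs, design, dependent):
    """Return the cycle at which the last CRC32 result becomes available.

    design "A": three non-pipelined units, each busy 3 cycles per instruction.
    design "B": one unit with a 3-stage pipeline, accepting one instruction per cycle.
    dependent:  True  -> every instruction consumes the previous result (one chain)
                False -> all instructions are independent
    Toy model only: assumes an unlimited front end and no other bottlenecks.
    """
    if design == "A":
        unit_free = [0, 0, 0]   # first cycle each unit can accept work
        occupancy = LATENCY     # a unit is blocked for the whole operation
    elif design == "B":
        unit_free = [0]
        occupancy = 1           # the pipeline accepts a new instruction every cycle
    else:
        raise ValueError("design must be 'A' or 'B'")

    done = 0  # cycle at which the most recent result is ready
    for _ in range(num_instrs):
        # Pick the unit that becomes free earliest.
        unit = min(range(len(unit_free)), key=lambda i: unit_free[i])
        operand_ready = done if dependent else 0
        start = max(unit_free[unit], operand_ready)
        unit_free[unit] = start + occupancy
        done = start + LATENCY
    return done


if __name__ == "__main__":
    n = 3000
    for design in ("A", "B"):
        chained = simulate(n, design, dependent=True) / n
        independent = simulate(n, design, dependent=False) / n
        print(f"design {design}: {chained:.2f} cycles/instr chained, "
              f"{independent:.2f} cycles/instr independent")
```

In this model both designs come out at about 3 cycles per instruction for a dependent chain and about 1 cycle per instruction for independent work, so a latency/throughput measurement looks the same either way. Telling them apart needs per-port information, such as the port usage uops.info reports.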
From the port usage reported at https://uops.info/html-instr/CRC32_R64_R64.html, we can conclude (B) for the Intel microarchitectures. For AMD it's not entirely obvious, but I agree (B) appears more likely.