by atq2119
1164 days ago
Take a look at Agner Fog's instruction tables: https://www.agner.org/optimize/instruction_tables.pdf

For example, Zen4 64-bit DIV is listed as 2 uOps, 10-18 cycles latency, 7-12 cycles inverse throughput. Only 2 uOps combined with a latency that varies by operand suggests uOps with variable execution lengths, i.e. the iteration happens inside the execution unit rather than as a fixed unrolled loop streamed by the microcode part of the frontend. You may be right that some CPUs did the fixed unrolling, but it doesn't seem to have been that common.
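To make the "iteration inside the execution unit" point concrete, here is a toy Python model of a divider that retires a fixed number of quotient bits per step and skips the leading zeros of the quotient. It is only a sketch: the function name `iterative_divide`, the `bits_per_step` parameter, and the early-out rule are my own assumptions for illustration, not AMD's actual algorithm, and the step count is not meant to match real cycle counts.

```python
def iterative_divide(dividend: int, divisor: int, bits_per_step: int = 2):
    """Toy model of an iterative unsigned divider (hypothetical, not Zen 4's).

    Returns (quotient, remainder, steps), where steps is how many
    iterations the modeled unit needed for these particular operands.
    """
    if dividend < 0 or divisor < 0:
        raise ValueError("this sketch only models unsigned division")
    if divisor == 0:
        raise ZeroDivisionError("division by zero")

    # Upper bound on significant quotient bits; leading zeros are skipped,
    # which is what makes the iteration count depend on the operands.
    quotient_bits = max(dividend.bit_length() - divisor.bit_length() + 1, 0)
    steps = -(-quotient_bits // bits_per_step)  # ceiling division

    mask = (1 << bits_per_step) - 1
    quotient = 0
    # Start with the dividend bits above the part we iterate over;
    # by construction this value is smaller than the divisor.
    remainder = dividend >> (steps * bits_per_step)

    for i in range(steps - 1, -1, -1):
        chunk = (dividend >> (i * bits_per_step)) & mask
        remainder = (remainder << bits_per_step) | chunk
        # Stand-in for the hardware's quotient digit selection logic.
        digit = remainder // divisor
        remainder -= digit * divisor
        quotient = (quotient << bits_per_step) | digit

    return quotient, remainder, steps


if __name__ == "__main__":
    for a, b in [(100, 7), (2**63, 3), (5, 2**40)]:
        q, r, n = iterative_divide(a, b)
        print(f"{a} / {b}: q={q} r={r} steps={n}")
```

In this model a small quotient finishes in a few steps and a full-width one takes many, all within what would be a single uOp. A fixed unrolled microcode loop would instead issue the same number of uOps regardless of the operands.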