by dzaima
747 days ago
More importantly for the "2 cycles" question, Zen 4 can get one-cycle latency for double-pumped 512-bit ops (for the ops where that's reasonable, i.e. basic integer/bitwise arithmetic). If Zen 5 really does make every pipe 512 bits wide, that would still be a massive throughput improvement over Zen 4, as long as the pipe count is less than halved; things don't stop at 1 op/cycle. A rather important question then is where that leaves AVX2 code.
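To put rough numbers on the throughput argument, here is a toy peak-throughput model. It is only a sketch: the pipe counts are made-up illustrations rather than actual Zen 4 or Zen 5 figures, `ops_per_cycle` is a hypothetical helper, and latency, port restrictions and front-end limits are ignored entirely.

```python
from fractions import Fraction


def ops_per_cycle(pipes: int, pipe_bits: int, op_bits: int) -> Fraction:
    """Peak vector ops per cycle in a toy model (an assumption, not a real spec).

    An op wider than a pipe is split into several passes (two passes is
    "double-pumping"); a narrower op still occupies a whole pipe for a cycle.
    """
    passes = max(1, -(-op_bits // pipe_bits))  # ceiling division
    return Fraction(pipes, passes)


# Hypothetical configurations: (number of pipes, pipe width in bits).
configs = {
    "4 x 256-bit, double-pumped": (4, 256),
    "4 x 512-bit": (4, 512),
    "3 x 512-bit": (3, 512),
    "2 x 512-bit (pipe count halved)": (2, 512),
}

for name, (pipes, width) in configs.items():
    avx512 = ops_per_cycle(pipes, width, 512)
    avx2 = ops_per_cycle(pipes, width, 256)
    print(f"{name}: 512-bit ops/cycle = {avx512}, 256-bit ops/cycle = {avx2}")
```

In this model, 512-bit throughput improves over the double-pumped baseline for any pipe count above half the original, while 256-bit (AVX2) throughput depends only on the number of pipes, so it drops as soon as pipes are removed.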
|
What would be different between doubling the pipe width vs. doubling the number of pipes (excluding inter-lane operations, which already had their own 512-bit pipe in Zen 4)?