|
> If anything, they seem to have thrown the bulldozer innovations out of the window That's basically the point of Zen, Bulldozer was an architectural dead-end that wasn't going anywhere. Besides, it's not like Intel have massively innovated since Sandybridge. Ivy, Haswell, Broadwell and Skylake are little more than successive perfections of the Sandybridge architecture. It's hard to tell from the slides, but it looks like Zen is a much wider architecture than Intel, with 10 execution ports (4 ALU, 2 AGU, 2 FP ADD, 2 FP MUL). Sandybridge had 6, Haswell and later have 8 execution ports. Bulldozer had 4 integer execution ports plus 2 float ports, which are shared between each pair of cores. The most interesting thing about those slides is the layout of the blocks marked "Scheduler". Intel chips all have a single scheduler, Bulldozer had one float scheduler (shared) and one integer scheduler. But I'm counting 7 schedulers on the Zen slides, one float scheduler managing the 4 floating point execution ports and 6 integer schedulers, one for each execution port. |
The text mentions it has to fuse the four FP ports to do a single 256-bit AVX per cycle. This is significantly less wide than Intel architectures (half/quarter). We can interpret the width thus as 4+2+1 ports, which is in the Haswell ballpark.
What is maybe more telling here is the 16-byte load/stores, Haswell is doing 32-byte at the same rate. It points to Zen abandoning FP bandwidth in both client and server. Perhaps they want to rely on GPGPU with the on-chip GPU to do compute workloads?
> The most interesting thing about those slides is the layout of the blocks marked "Scheduler". Intel chips all have a single scheduler, Bulldozer had one float scheduler (shared) and one integer scheduler. But I'm counting 7 schedulers on the Zen slides, one float scheduler managing the 4 floating point execution ports and 6 integer schedulers, one for each execution port.
Depends what they mean with Scheduler. If it means reservation stations for micro-ops, then that's already the case in other micro-architectures. If Scheduler means assigning micro-ops per port, than there can logically only be a single one.