by jng
246 days ago
He is no Jim Keller, and the mostly[1] automated transcript makes it read cringe, but it is not at all devoid of content. Some examples of very interesting, non-obvious content:
* Even if store ports are kept fixed (2 in his example), adding store address generators (up to 4 in his example) actually improves performance, because it frees up load port dependencies.
* Within the same core, they use two different styles of load/store address contention mechanisms, which he describes as two tables, one with explicit "allows" and the other with explicit "denies" -- which of course end up converging (I understand it refers to two different encodings which vary in what is stored).
* Between cores, they have completely separate teams which reach different designs for things like this.
* It was interesting to me to discover how isolated the different core design teams work (which makes sense).
* It was interesting to me to picture the load/store address contention subsystem, which must be quite complex and needs to be really fast.

I'll stop listing here, but there is more, e.g. on different types of workloads: gaming workloads are similar to DB workloads, and the two are more similar to each other than to SPEC benchmarks, and so on. Just go read the interview if you're interested in CPU design!

[1] mostly automated: at least the dialog name labels seem to be hand-edited, as one of them has a typo.
|
What made the transcription "cringe"? I'd like to believe it's accurate.