by dzaima
476 days ago
Both Intel and AMD, to some extent, split the vector ALUs and the register file into 128-bit (or 256-bit?) lanes, and arithmetic ops never need to move data across those lanes. Of course, loads, stores, and shuffles still need to cross them, which makes this point somewhat moot.
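To make the lane distinction concrete, here is a toy Python model, not a description of any real microarchitecture. It assumes a 256-bit vector of eight 32-bit elements split into two 128-bit lanes; `LANE`, `add_per_lane`, and `lanes_read_by_shuffle` are hypothetical names made up for the illustration. An element-wise add can be done one lane at a time without looking at the other lane, while a general shuffle may need inputs from any lane.

```python
LANE = 4  # elements per 128-bit lane, assuming 32-bit elements (illustrative)


def split_lanes(v):
    """Cut a vector (a plain list) into consecutive lanes of LANE elements."""
    return [v[i:i + LANE] for i in range(0, len(v), LANE)]


def add_per_lane(a, b):
    """Element-wise add, computed lane by lane.

    Each output lane only reads the matching lane of a and b,
    so no data ever crosses a lane boundary.
    """
    out = []
    for lane_a, lane_b in zip(split_lanes(a), split_lanes(b)):
        out.extend(x + y for x, y in zip(lane_a, lane_b))
    return out


def lanes_read_by_shuffle(indices):
    """For each output lane, list which source lanes a shuffle reads from.

    indices[i] is the source element that ends up in output position i.
    """
    return [sorted({i // LANE for i in lane}) for lane in split_lanes(indices)]


def crosses_lanes(indices):
    """True if any output lane needs data from a different source lane."""
    reads = lanes_read_by_shuffle(indices)
    return any(src != [k] for k, src in enumerate(reads))


a = list(range(8))
b = [10] * 8
print(add_per_lane(a, b))                       # same as a whole-vector add
print(crosses_lanes([3, 2, 1, 0, 7, 6, 5, 4]))  # reverse within each lane
print(crosses_lanes([7, 6, 5, 4, 3, 2, 1, 0]))  # reverse the whole vector
```

The in-lane reverse stays inside its own lane, but the full reverse has every output lane reading from the other one, which is the kind of op that still needs a cross-lane path.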