by the8472
1382 days ago
My understanding is that they load weights into SRAM only occasionally, then pump training data in from the sides of the die and have multiple cores operate on a wavefront of data. The cores therefore don't compete for host memory bandwidth, because the same data flows (transformed) through multiple cores.
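To make the idea concrete, here is a toy sketch of that kind of weight-stationary pipeline. It is only a model of my understanding, not the actual hardware or its API: `Core`, `run_wavefront`, and the scalar "weights" are all made up for illustration. Each core keeps its weight locally, each sample is read from the host once, and the transformed value is handed from core to core.

```python
from dataclasses import dataclass


@dataclass
class Core:
    # Hypothetical core: the weight stands in for data resident in local SRAM.
    weight: float

    def step(self, x: float) -> float:
        # Transform the incoming value using only locally held state.
        return self.weight * x


def run_wavefront(weights, samples):
    """Stream samples through a chain of cores, one hop per tick."""
    cores = [Core(w) for w in weights]
    regs = [None] * len(cores)  # regs[i] = value at the output of core i
    host_reads = 0              # how many times we touch "host memory"
    outputs = []
    stream = list(samples)

    # Enough ticks to feed every sample and drain the pipeline.
    for tick in range(len(stream) + len(cores)):
        # Collect whatever has reached the end of the chain.
        if regs[-1] is not None:
            outputs.append(regs[-1])
        # Advance back to front so each value moves exactly one core per tick.
        for i in range(len(cores) - 1, 0, -1):
            prev = regs[i - 1]
            regs[i] = cores[i].step(prev) if prev is not None else None
        # Only the first core ever reads from the host.
        if tick < len(stream):
            host_reads += 1
            regs[0] = cores[0].step(stream[tick])
        else:
            regs[0] = None
    return outputs, host_reads


outputs, host_reads = run_wavefront([2, 3, 5], [1, 2])
print(outputs, host_reads)
```

In this model the number of host reads equals the number of samples, no matter how many cores are in the chain, which is the point about the cores not competing for host bandwidth.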