by jedbrown
2139 days ago
A100 has a 20% edge on energy efficiency for HPL, along with higher intrinsic latencies. It's also 6-12 months behind A64FX in deployment. https://www.top500.org/lists/green500/2020/06/ HPCG mostly tests memory bandwidth rather than interconnect, but Fugaku does have a great network. Adding DRAM to a GPU-heavy machine has limited benefit because the bandwidth between that DRAM and the device is relatively low. They're effectively both HBM machines if you need ~TB/s of bandwidth per device (or per socket). Normalizing per node (versus per energy or cost) isn't particularly useful unless your software doesn't work well with distributed memory.
The POWER10 chip under discussion has 1 TB/s of bandwidth to devices with expandable RAM.
Yeah, I didn't think it was possible. But... congrats to IBM for getting this done. Within the context of this hypothetical POWER10, 1 TB/s interconnects to expandable RAM are on the table.