by lmeyerov
2359 days ago
Restricting the assessment to a tiny GPU workload is increasingly the wrong call. GPU compute stacks are more and more geared towards multi-GPU/multi-node and streaming, esp. given the crazy bandwidth they're now built for (2TB/s for a DGX-2 node?). Likewise, per-GPU and per-node memory goes up nicely each year (16-24GB per GPU, 100-512GB per node, with TBs connected on the same node). If you saturate that, the network is more likely to become the bottleneck than your DB :) That said, in practice I mostly do single-GPU streaming, b/c I like not having to think about multi-node, and GPUs are pretty cheap now :)