Hacker News
by tos1 64 days ago
That’s a pretty non-standard H200 configuration. In regular HGX configurations, that much CPU DRAM is what an entire 8xH200 node has. That makes the paper's title somewhat arguable imo.