by Tiberium 483 days ago
The H800 is the export variant of the H100, and it's the hardware they had access to. They reference it directly in the repo:

>Achieving up to 3000 GB/s in memory-bound configuration and 580 TFLOPS in computation-bound configuration on H800 SXM5, using CUDA 12.6.
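
For a rough sense of what those two numbers imply together, here is a back-of-the-envelope sketch. It assumes a simple roofline-style view and treats the quoted figures as achieved rates rather than hardware peaks; the function name `crossover_intensity` is made up for illustration and is not from the repo.

```python
# Figures quoted from the repo for H800 SXM5 (achieved, not theoretical peak).
mem_bandwidth_gb_s = 3000   # GB/s, memory-bound configuration
compute_tflops = 580        # TFLOPS, computation-bound configuration


def crossover_intensity(tflops: float, gb_per_s: float) -> float:
    """FLOPs per byte at which compute time equals memory time
    for the given rates (hypothetical helper, simple roofline view)."""
    return (tflops * 1e12) / (gb_per_s * 1e9)


intensity = crossover_intensity(compute_tflops, mem_bandwidth_gb_s)
print(f"{intensity:.1f} FLOP/byte")
```

Under those assumptions, a workload doing fewer FLOPs per byte moved than this would sit on the memory-bound side, and one doing more would sit on the computation-bound side.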