by winwang 582 days ago

RTX 4090 tensor performance (FP8): 660 teraflops dense, 1320 "with sparsity" (i.e. the theoretical max with zeroes in the right places).

But at these levels of compute, the memory/interconnect bandwidth becomes the bottleneck.
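
To put a rough number on that, here is a back-of-the-envelope roofline sketch. The 660 teraflops figure is the one quoted above. The 1008 GB/s memory bandwidth is my assumption (the commonly cited spec-sheet figure for the 4090, not something stated in this comment), and the function names are made up for illustration. It also assumes 1 byte per element for inputs and outputs, which is optimistic since real kernels often write wider outputs.

```python
# Back-of-the-envelope roofline check using peak (spec-sheet) numbers only.
PEAK_FP8_FLOPS = 660e12   # dense FP8 tensor throughput quoted in the comment
MEM_BANDWIDTH = 1008e9    # bytes/s; ASSUMED spec figure, not from the comment


def break_even_intensity(peak_flops, bandwidth):
    """FLOPs that must be done per byte moved to keep the compute units busy."""
    return peak_flops / bandwidth


def attainable_flops(intensity, peak_flops, bandwidth):
    """Roofline model: throughput is capped by compute or by memory traffic."""
    return min(peak_flops, intensity * bandwidth)


def matmul_intensity(m, n, k, bytes_per_elem=1):
    """Ideal FLOPs/byte for (m x k) @ (k x n), each matrix touched exactly once.

    bytes_per_elem=1 is an assumption (FP8 in and out); caches and tiling
    are ignored, so this is an upper bound on the real intensity.
    """
    flops = 2 * m * n * k
    bytes_moved = bytes_per_elem * (m * k + k * n + m * n)
    return flops / bytes_moved


# Intensity needed before the tensor cores, not memory, become the limit.
needed = break_even_intensity(PEAK_FP8_FLOPS, MEM_BANDWIDTH)

# Matrix-vector style work (e.g. batch-1 inference): roughly 2 FLOPs per byte.
skinny = attainable_flops(matmul_intensity(1, 4096, 4096),
                          PEAK_FP8_FLOPS, MEM_BANDWIDTH)

# Large square matmul: intensity is high enough to reach the compute roof.
square = attainable_flops(matmul_intensity(4096, 4096, 4096),
                          PEAK_FP8_FLOPS, MEM_BANDWIDTH)
```

Under these assumptions you need on the order of several hundred FLOPs per byte of memory traffic to stay compute-bound, so a big square matmul can get there in principle while matrix-vector style workloads end up a tiny fraction of peak. Interconnect bandwidth is lower still, so anything that has to cross it is worse off.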