by misterdabb 820 days ago
It's a weird graph... The y-axis is specifically tokens per GPU, while the x-axis is "interactivity per second", so the y-axis includes Blackwell being twice the size and also the gain from fp8 -> fp4. Note that the fp4 change needs to be counted multiple times, since half as much data needs to go through the network as well.
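To make the compounding concrete, here's a rough sketch of how you might back those factors out of a per-GPU number. Everything in it is an assumption for illustration: the function names are hypothetical, the 2x size factor is just "twice the size" from above, the 2x precision factor assumes fp4 doubles throughput over fp8, and the headline ratio is a made-up placeholder, not a figure from the graph.

```python
# Illustrative only: every number here is an assumption, not a measured value.

def normalized_speedup(headline_ratio, size_factor=2.0, precision_factor=2.0):
    """Divide a headline per-GPU ratio by the size and precision factors.

    size_factor: assumed 2x, for a GPU that is twice the size.
    precision_factor: assumed 2x, for moving from fp8 to fp4.
    """
    return headline_ratio / (size_factor * precision_factor)


def bytes_moved(num_values, bits_per_value):
    """Bytes needed to send num_values values at the given bit width."""
    return num_values * bits_per_value / 8


headline = 30.0  # made-up placeholder ratio, NOT taken from the graph
print(normalized_speedup(headline))

# Halving the bit width halves the network traffic for the same number of
# values, which is why the fp8 -> fp4 change shows up more than once.
print(bytes_moved(1_000_000, 8) / bytes_moved(1_000_000, 4))
```

The point is only that the factors multiply: a per-GPU ratio bakes in the bigger chip and the narrower datatype, and the narrower datatype helps again on the network side.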