by bitsandboots 457 days ago
Benchmarks of what? Memory speed matters for some things but not others. It matters a lot for AI training, but less for AI inference. It matters a lot for games too, but nobody would play a game on this or a Mac.
1 comment

AI inference is actually typically more memory-bandwidth-limited than training. Training can reuse each weight it reads across all <sequence length> * <batch size> tokens. Inference, specifically decoding, has to read all of the weights for each new token, so the FLOPs per byte are much lower during inference!
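
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes 2-byte (fp16/bf16) weights, counts one multiply and one add per parameter per token, and only looks at weight traffic in the forward pass. It ignores activations, the KV cache, the backward pass, and batched decoding. The function name and the shapes are made up for illustration.

```python
def flops_per_weight_byte(tokens_per_weight_read: int,
                          bytes_per_param: float = 2.0) -> float:
    """Approximate matmul FLOPs done per byte of weights read from memory.

    Each parameter contributes 2 FLOPs (multiply + add) for every token
    processed while that parameter is loaded.
    """
    return 2.0 * tokens_per_weight_read / bytes_per_param


# Hypothetical training step: every weight read is shared by all tokens.
seq_len, batch_size = 2048, 8
training = flops_per_weight_byte(seq_len * batch_size)

# Single-stream decoding: every weight is re-read for one new token.
decoding = flops_per_weight_byte(1)

print(f"training: ~{training:.0f} FLOPs per weight byte")
print(f"decoding: ~{decoding:.0f} FLOPs per weight byte")
```

Under these assumptions the gap is simply the number of tokens sharing each weight read, which is why single-stream decoding tends to end up limited by memory bandwidth rather than compute.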