by junrushao1994 1038 days ago
LLM decoding is dominated by memory bandwidth, and the 3090 Ti and 4090 happen to have identical theoretical memory bandwidth, so the two cards end up with roughly the same decoding speed.
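
A rough back-of-the-envelope sketch of why that follows, assuming single-batch decoding has to read every weight once per generated token. The function name, the 7B / 4-bit model size, and the 1008 GB/s figure (the published spec number for both cards, as far as I know) are illustrative assumptions, not measurements:

```python
def decode_tokens_per_second_ceiling(bandwidth_gb_s: float,
                                     params_billion: float,
                                     bytes_per_param: float) -> float:
    """Upper bound on decode speed if each token reads all weights once.

    Ignores KV-cache traffic, kernel overhead, and compute entirely,
    so real throughput will be lower than this ceiling.
    """
    model_size_gb = params_billion * bytes_per_param  # weights only
    return bandwidth_gb_s / model_size_gb


# Assumed spec-sheet bandwidth, in GB/s, for each card.
CARDS = {"RTX 3090 Ti": 1008.0, "RTX 4090": 1008.0}

# Hypothetical model: 7B parameters quantized to 4 bits (0.5 bytes each).
for name, bandwidth in CARDS.items():
    ceiling = decode_tokens_per_second_ceiling(bandwidth, 7.0, 0.5)
    print(f"{name}: at most ~{ceiling:.0f} tokens/s")
```

Under these assumptions both cards get the same ceiling (about 288 tokens/s), because compute throughput never enters the estimate. The 4090's extra compute only matters in compute-bound phases such as prefill or large-batch decoding.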