Hacker News
by avereveard 916 days ago
Batching changes that equation a fair bit. Also, these cards won't draw full power: LLM inference is mostly limited by memory bandwidth, so the compute units sit idle part of the time.
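A rough back-of-the-envelope sketch of that argument, in roofline style. Every number here (model size, bandwidth, peak FLOPS) is a made-up assumption for illustration, not a measurement of any real card, and the model ignores KV-cache traffic and kernel overheads. Compute utilization is only a loose stand-in for why power draw stays below peak; this is not a power model.

```python
# All hardware and model numbers below are illustrative assumptions.
PARAMS = 7e9            # hypothetical model size (parameters)
BYTES_PER_PARAM = 2     # assuming fp16 weights
MEM_BW = 1e12           # assumed memory bandwidth, bytes/s
PEAK_FLOPS = 100e12     # assumed peak compute, FLOP/s


def decode_step(batch):
    """Estimate one decode step for `batch` sequences.

    Returns (step_seconds, compute_utilization).
    Weights are streamed from memory once per step and shared by the
    whole batch, while compute grows linearly with the batch size.
    """
    t_mem = PARAMS * BYTES_PER_PARAM / MEM_BW
    # Roughly 2 FLOPs per parameter per generated token.
    t_compute = batch * 2 * PARAMS / PEAK_FLOPS
    step = max(t_mem, t_compute)
    return step, t_compute / step


def tokens_per_second(batch):
    """Aggregate tokens generated per second across the batch."""
    step, _ = decode_step(batch)
    return batch / step


if __name__ == "__main__":
    for b in (1, 8, 64, 256):
        step, util = decode_step(b)
        print(f"batch={b:4d}  step={step * 1e3:6.2f} ms  "
              f"compute busy={util:5.1%}  tok/s={tokens_per_second(b):9.0f}")
```

Under these assumptions a single sequence keeps the compute units busy only about 1% of the time, and throughput scales almost linearly with batch size until the step becomes compute-bound (around batch 100 with these numbers).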