by orbital-decay 5 days ago
It's not a tradeoff in this case: this is an optimized megakernel for the same model, built for better throughput. And no, in most cases accuracy can be sacrificed in favor of throughput or latency (assessing the accuracy loss automatically is the harder part).