by qsera 8 days ago
> Speed is the major limiting factor for high-level automation.

Yes, but the point is the quality of inference is more important than speed. What good is speed if inference is shit?

1 comment

It's not a tradeoff in this case: this is an optimized megakernel for the same model, built for better throughput. And no, in most cases accuracy can be sacrificed in favor of throughput or latency (assessing accuracy automatically is the harder part).