by qsera 6 days ago
Tokens per second is the "Megapixels" of AI marketing!
2 comments

I mean, sure, in the sense that it's a real and meaningful number for most of the spectrum on offer, and only gets silly when the number gets too high? There's a pretty big usability difference between 10 t/s and 100 t/s, and I can imagine something similar for 100 -> 1000. I don't know about > 1000, but let's not pretend that the number is meaningless.
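To put rough numbers on that usability gap, here's a back-of-the-envelope sketch. The 500-token reply length and the `generation_seconds` helper are made up for illustration, and it assumes a steady decode rate while ignoring time to first token:

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Seconds to stream num_tokens at a steady decode rate.

    Simplification: ignores time to first token and any rate variation.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


REPLY_TOKENS = 500  # hypothetical reply length, not a measured figure

for rate in (10, 100, 1000):
    seconds = generation_seconds(REPLY_TOKENS, rate)
    print(f"{rate:>5} t/s -> {seconds:.1f} s")
```

Under those assumptions the same reply takes about 50 s at 10 t/s, 5 s at 100 t/s, and 0.5 s at 1000 t/s, so each 10x step moves it across a different threshold of waiting.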
It is pretty meaningless for something that calls itself intelligent.
Definitely not. There's a ton of potential realtime use cases, and high throughput plus low TTFT (time to first token) is exactly what they need.
Of course, megapixels are also useful if you want to print large sizes.
Completely incomparable. Large printing is a narrow niche in art and technical photography, part of which is already covered by composites, and pixel size is a physical tradeoff for sensors. Cases for reasoning at realtime speeds are much, much more diverse, infinitely more diverse than anything we're currently using the big models for. Consider the fact that large models don't necessarily imply language. Speed is the major limiting factor for high-level automation. Coding is simply the immediate killer app that is useful right now, given the current state of AI - just like roleplaying and chatbots were previously.
> Speed is the major limiting factor for high-level automation.

Yes, but the point is the quality of inference is more important than speed. What good is speed if inference is shit?

It's not a tradeoff in this case: this is an optimized megakernel for the same model, built for better throughput. And no, in most cases accuracy can be sacrificed in favor of throughput or latency (assessing it automatically is the harder part).