Hacker News
by __natty__ 25 days ago
I wonder when we'll reach a speed of 1000 tps with high-quality models. 5 years? 10 years?
2 comments

Don't set your goals so low. We've already reached 17k on small models.

Since the whole goal of software architecture schemes is to allow the rest of us non-geniuses to still understand and modify the software, perhaps the same could be true of LLMs.

Perhaps a hypothetical (small) model running at a million tokens per second could be more useful than a state-of-the-art big one.

We technically can (check Cerebras, Groq, and Gemini Diffusion), but it's not economically viable and not a priority for product managers.

Maybe when intelligence plateaus, speed could become a main differentiating factor, like battery life did for smartphones.