Hacker News
by PhunkyPhil 8 days ago
Obligatory Taalas mention:

https://taalas.com/

Despite the performative UI components, they have a shipped (demo) product:

https://chatjimmy.ai/

This is only Llama 3.1 8B with a very small context window, but at 17k tokens per second it's likely fast enough to call tools reliably, which would make a huge difference in agentic applications. Assuming they can bake in better models, I'm just as bullish on this, or even more so, considering it opens up edge computing at an extremely low power requirement.

High tok/s is the future IMO.

1 comment

My dream is Claude or Codex running at this speed.
More realistically, I'm hoping for Qwen 3.6 27B on Taalas.