by dalemhurley 126 days ago

1000 tokens/sec from a highly specialised model is what agents are going to require.

Dedicated knowledge, fast output, rapid iteration.

I have been trying out SMOL models, as coding models don't need the full corpus of human history.

My most recent build was good but too small.

I am thinking of a model that is highly tuned to coding and agentic loops.