by dalemhurley
126 days ago
1000 tokens/sec from a highly specialised model is what agents are going to require: dedicated knowledge, fast output, rapid iteration. I have been trying out SMOL models, as coding models don't need the full corpus of human history. My most recent build was good but too small. I am thinking of a model that is highly tuned to coding and agentic loops.