by 0vermorrow
331 days ago
I'm eagerly awaiting Qwen 3 Coder becoming available on Cerebras. I run plenty of agent loops, and the speed makes an interesting difference in time "compression". Having a Claude 4 Sonnet-level model running at 1000-1500 tok/s would be extremely impressive. To FEEL THE SPEED, you can try it yourself on the Cerebras Inference page, through their API, or, for example, on Mistral's Le Chat with its "Flash Answers" (powered by Cerebras). Iterating on code at 1000 tok/s makes it feel even more magical.

It will also impact how we work: interactive IDEs like Cursor probably make more sense than CLI tools like Claude Code when answers are nearly instant.