Hacker News
by
vessenes
265 days ago
Roughly 1/10 the cost of Opus 4.1 and 1/2 the cost of Sonnet 4 on a per-token inference basis. Impressive. I'd love to see a fast (Groq-style) version of this served. I wonder if the architecture is amenable to that.
2 comments
petesergeant
265 days ago
Cerebras are hosting other Qwen models via OpenRouter, so probably
aitchnyu
265 days ago
Isn't it a 3x rate difference? $0.7 for Qwen3-VL vs $3 for Sonnet 4?
vessenes
264 days ago
OpenRouter had $8-ish per 1M tokens for Qwen and $15 per 1M for Sonnet 4 when I checked.