Hacker News
by qeternity 115 days ago
Number of parameters is at least a proxy for model capability.

You can achieve incredible tok/dollar or tok/sec with Qwen3 0.6b.

It just won't be very good for most use cases.

1 comment

Model capability is the other axis on their chart. So they could have put Qwen3 0.6b there; it would be in the bottom right corner.

I know what they are trying to do. They are attempting to show a kind of Pareto frontier, but it’s a little awkward.
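
For anyone unfamiliar with the term, here is a minimal sketch of a Pareto frontier over two axes, assuming higher is better on both (a capability score and tokens per second). The model names and numbers are made up for illustration and are not taken from the chart; `pareto_frontier` is a hypothetical helper, not anything the chart's authors published.

```python
def pareto_frontier(points):
    """Return names of models that no other model dominates.

    A model is dominated if another model is at least as good on both
    axes and strictly better on at least one.
    """
    frontier = []
    for name, capability, tok_per_sec in points:
        dominated = any(
            (c >= capability and t >= tok_per_sec)
            and (c > capability or t > tok_per_sec)
            for _, c, t in points
        )
        if not dominated:
            frontier.append(name)
    return frontier


# Hypothetical (name, capability score, tokens/sec) values, illustration only.
models = [
    ("tiny", 10, 1000),     # very fast, weak: the "bottom right corner" case
    ("small", 30, 400),
    ("mid-slow", 25, 200),  # worse than "small" on both axes
    ("large", 80, 50),
]

frontier = pareto_frontier(models)
print(frontier)
```

With these numbers the tiny model still lands on the frontier, since nothing beats it on speed. That is the awkward part: being on the frontier says nothing about whether a model is good enough for a given use case.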