|
|
|
|
|
by overfeed
3 days ago
|
|
> Wrong, mostly. > Model capability is a function of model size Model effectiveness has improved across model sizes. You really should try the latest flash variants more. They have become my default for most tasks except for gnarly high-level planning. |
|
A 2026 4B beats 2024 4B, but both are far behind the contemporary frontier. Which makes them bad. There is no such thing as "too much capability" - a "good" model is whatever the current frontier is.
In 2024, a "good" model is one that can be trusted to write a 800 line script. In 2026, it's a model that can be trusted to do gnarly high-level planning and execution both. In 2028, it's going to be something like a model you can point at an extremely involved task, abandon, and have it report back with a "done" in 3 weeks.