by CuriouslyC
46 days ago
Parameter count buys you world knowledge and more consistent behavior as the context grows. Both of those can be engineered around to a large degree, and the latest Qwen models show that small models can be quite smart in narrow domains and over short time windows.