by UncleOxidant
27 days ago
> It is almost guaranteed that a 60-90B model can outperform current SOTA in coding tasks within 2-3 years

Given how well Qwen3.6-27B performs for such a small model, I think you could be right. I suspect that Google, OpenAI, and Anthropic are looking at the Qwen3.6 models (as well as Deepseek V4-flash and MiMo-V2.5) and wondering whether they could make smaller models trained specifically for certain activities, like coding. Smaller, more targeted models would take up far fewer resources.