by otabdeveloper4
43 days ago
> higher param count models will remain smarter for a looong time

They're not smarter, they just know more stuff. You probably don't need knowledge about Pokemon or the Diamond Sutra in your enterprise coding LLM. The "smarts" comes from post-training, especially around tool use.
That's one of the biggest remaining head-scratchers in this whole business. You do need all that unrelated stuff to make a good coding model.
Nobody knows why you can't build a coding model by training on nothing but code, CS texts, specifications, and case studies, but so far it appears that you can't.