by BoorishBears
331 days ago
In the BERT era of language models, it was normal that getting the best performance on a task probably required targeted post-training.

As models got bigger and instruction following got better, everyone jumped on the general capabilities of the model plus prompting.

We're approaching a wall that will take a completely new, unheard-of breakthrough to overcome; otherwise we're going to have to go back to specialized post-training (which lends itself to vertical solutions).

I think people are seeing that now with stuff like Devstral being post-trained specifically for OpenHands and massively overperforming for its size at agentic coding.