by ActorNightly
8 days ago
Nope, lol. Large models are still quite far ahead; don't be fooled into thinking that even Gemma:31b (which is better than the 12b overall) is anywhere close to the big models. There is definitely room for optimization, but fundamentally, for complex tasks, you need small gradients to be visible enough that the model can be trained on them (and consequently follow them during inference). For example, if you specify in the instructions not to write code but then ask a coding question, Gemma will still write code, whereas Gemini/Claude will pick up on that and follow your instructions better.