Y
Hacker News
new
|
ask
|
show
|
jobs
by
rhdunn
123 days ago
Mistral have small variants (3B, 8B, 14B, etc.), as do others like IBM Granite and Qwen. Then there are finetunes based on these models, depending on your workflow/requirements.
1 comments
karmasimida
123 days ago
True, but anything remotely useful is 300B and above
link
Eupolemos
123 days ago
That is a very broad and silly position to take, especially in this thread.
I use Devstral 2 and Gemini 3 daily.
link
ac29
121 days ago
Devstral 2 is 123B parameters. Thats less than 300B, but its still much larger than the 3-14B models GP was talking about.
link