by sometimelurker 35 days ago
> ... didn't find any of their models remotely useful.

I daily drive Ministral-3-3B-Reasoning, getting a nice 70 tok/s on my GPU. It's actually much faster than Google/DDG searches, and I use it a ton when coding to check methods and remember ways to implement things. It's better than the Qwen stuff of the same size because it's not filled with CCP bs.

1 comment

> because it's not filled with CCP bs

You do know that Mistral distilled from DeepSeek, yeah?