by Jackson__ 197 days ago
So they spent all of their R&D budget copying DeepSeek, leaving none for the single novel feature they added: vision.

To quote the hf page:

>Behind vision-first models in multimodal tasks: Mistral Large 3 can lag behind models optimized for vision tasks and use cases.

1 comment

Well, behind "models", not "language models".

Of course models made purely for image tasks will completely outclass it. Vision language models are useful for their generalist capabilities.