by n2d4 920 days ago
Mixtral is on par with Gemini Pro, not Gemini Ultra (and even there it is further behind Gemini Pro than Gemini Pro is behind GPT 3.5). But to directly answer your question, they are quite well-funded, having raised over $700 million to date. I definitely wouldn't count them out.

Mixtral ranks higher than Gemini Pro on the (subjective) Chatbot Arena Leaderboard: https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboar...

Where are you seeing that it is "further behind Gemini Pro than Gemini Pro is behind GPT 3.5"?

Presumably in the very article this HN submission is for (https://arxiv.org/pdf/2312.11444.pdf), table 1.
Mixtral is missing in half of the benchmarks in that paper. Hardly conclusive. It’s also common knowledge that these benchmarks have a lot of issues[0]. A good litmus test, but not a substitute for actually seeing how the models do in the real world.

On the topic of “hardly conclusive” things, Gemini Pro literally told me just a few minutes ago[1] that the Avatar movies did not have humans in them. There was no funny business in the prompting. At least Mixtral knows that Avatar has humans in it. Most of Gemini Pro’s responses have been fine, but not exceptional.

[0]: one random article talking about these issues: https://www.surgehq.ai//blog/hellaswag-or-hellabad-36-of-thi...

[1]: https://i.imgur.com/En37EJD.png

Gemini Ultra is not out yet. With the same logic, you could compare an unreleased Mistral model with Gemini Ultra.
Right. I'm just pointing out that comparing one model with a distilled version of another and then making broad statements about the companies behind them isn't really useful.

Surely you could make a comparison of two unreleased models, but it wouldn't be interesting because you don't have any real data (and benchmarks don't really mean anything).

Debating the usefulness of HN commentary is a somewhat philosophical issue, but I think it's entirely fair to draw comparisons based on what is, not what might be.

Gemini Ultra is self-evidently not ready for production. What are the issues? Who knows, but in a game that as of right now is mostly about reducing the amount of brute force required, something as "simple" as not being efficient enough is actually not something to gloss over. If your engine's entire shtick is having the greatest graphics but you can't make it run at acceptable fps, well, then it's not actually a usable product.

An LLM that is not actually released could very well be in a comparably dire state, and fixing it while also delivering on the promised performance might be entirely non-trivial.

Mistral “Medium” is available (in beta, via API) and should give better results than the “Small” Mixtral model.