by IgorPartola 107 days ago
The issue isn’t 5.4 > 5.2 etc. It is that there is a second dimension which is the model size and a third dimension which is what it is tuned for. And when you are releasing so quickly that your flagship instant mini model is on one numerical version but your flagship tool-calling mini model is on another, it is confusing trying to figure out which actual model you want for your use case.

It’s not impossible to figure out but it is a symptom of them releasing as quickly as possible to try to dominate the news and mindshare.

1 comment

> The issue isn’t 5.4 > 5.2 etc. It is that there is a second dimension which is the model size and a third dimension which is what it is tuned for.

All 3 models are tuned for general purpose work.

Model size isn’t how you pick which model to use. You pick based on performance in evals compared to price.

It’s not hard to imagine that the more expensive models are probably larger or have higher compute requirements.