

by k2xl 697 days ago

This is great - Though I am confused on two things:

1. How is it possible that GPT-4o mini outperforms 3.5 turbo but 3.5 turbo is more expensive? Like why would someone use a worse model and pay more?

2. Why do GPT-4o vision and GPT-4o mini vision cost the same?

5 comments

petercooper 697 days ago

I might be wrong, but I've inferred from OpenAI's pricing behavior that they use it to encourage people to migrate to more efficient models. The 3.5 Turbo pricing is maintained to encourage you to stop using it. Look at davinci-002's pricing, for example - it's very high for something that's relatively ancient.


alach11 697 days ago

It's also very likely that 3.5-turbo is more expensive for them to run than gpt-4o-mini. Models are getting smaller and more efficient. They just keep 3.5-turbo around for legacy support.


hayksaakian 697 days ago

Exactly. The only people who would use 3.5 now are people who MUST use it due to some specification, contract or requirement.

You can charge a premium to people who aren't allowed to change their mind.


observationist 697 days ago

Predictability with a particular set of prompts and processes. Over time, you'd migrate to the lower cost, higher performing model, as long as it can be at least as consistent as the higher cost model. People have built really weirdly intricate chains of dependency on things that particular models are good at, and sometimes 3.5 turbo can accomplish a task dependably where other models might refuse, or have too wide a variance to be relied on.

Over time, reliability and predictability will be much less of an issue.


palisade 697 days ago

4o mini is more efficient, so it costs them less to host than 3.5 turbo.


Tiberium 697 days ago

1. It's not a worse model, it's a better model. Two years ago all we had was text-davinci-003, which is much, much worse than, for example, the current Claude 3.5 Sonnet which costs like 5x less.


laborcontract 697 days ago

Regarding 1, they have a strong understanding of the tasks/queries their users are performing and they are pruning the model accordingly. It's like playing jenga but with neurons.
