| HN Mirror


by refulgentis 481 days ago

That's a lot of other stuff, and you express disagreement.

I'm sure we both agree it's the first model at this scale, hence the price.

> It's not really the beginning (1.0) of anything

It is an LLM without reasoning training.

Thus, the public decision to make 5.0 = 4.5 + reasoning.

> "more like the end...the last scale-up pre-training experiment."

It won't be the last scaled-up pre-training model.

I assume you mean what I also expect, and what you go on to articulate: it'll be the last scaled-up-pre-training-without-reasoning-training-too model released publicly.

As we observe, scaled-down pre-training (in your parlance) with reasoning training is worth roughly the same on benchmarks as scaled-up pre-training without reasoning training.

1 comments

HarHarVeryFunny 481 days ago

> Yes it is. It's the first model at this scale.

Is it? Bigger than Grok 3? How do you know - just because it's expensive?


refulgentis 481 days ago

At some point, I have to say to myself: "I do know things."

I'm not even sure what the alternative theory would be: no one stepped up to dispute OpenAI's claim that it is, and X.ai is always eager to slap OpenAI around.

Let's say Grok is also a pretraining scale experiment. And they're scared to announce they're mogging OpenAI on inference cost because (some assertion X, which we give ourselves the charity of not having to state to make an argument).

What's your theory?

Steelmanning my guess: The price is high because OpenAI thinks they can drive people to Model A, 50x the cost of Model B.

Hmm... while publicly proclaiming it's not worth it, even providing benchmarks showing that Model B gets the same scores 50x cheaper?

That doesn't seem reasonable.


HarHarVeryFunny 480 days ago

OpenAI have apparently said that GPT 4.5 has a knowledge cutoff date of October 2023, and their System Card for it says "GPT 4.5 is NOT a frontier model" (my emphasis).

It seems this may be an older model that they chose not to release at the time, and are only releasing now because they feel pressure to ship something after recent releases by DeepSeek, Grok, Google and Anthropic. Perhaps they did some post-training to "polish the turd" and give it the better personality that seems to be one of its few improvements.

Hard to say why it's so expensive - because it's big and expensive to serve, or for some marketing/PR reason. It seems that many sources are confirming that the benefits of scaling up pre-training (more data, bigger model) are falling off, so maybe this is what you get when you scale up GPT 4.0 by a factor of 10x - bigger, more expensive, and not significantly better. Cost to serve could also be high because, not intending to release it, they have never put the effort in to optimize it.


refulgentis 480 days ago

See, you get it: if we want to know nothing, we can know nothing.

For all we know, Beelzebub Herself is holding Sam Altman's consciousness captive at the behest of Nadella. The deal is Sam has to go "innie" and jack up OpenAI costs 100x over the next year so it can go under and Microsoft can get it all for free.

Have you seen anything to disprove that? Or even anything casting doubt on it?
