Hacker News
by tsunego 479 days ago
GPT-4.5 feels like OpenAI's way of discovering just how much we'll pay for diminishing returns.

The leap from GPT-4o to 4.5 isn't a leap—it's an expensive tiptoe toward incremental improvements, priced like a luxury item without the luxury payoff.

With pricing at 15x GPT-4o, they're practically daring us not to use it. Given this, I wouldn't be surprised if GPT-4.5 quietly disappears from the API once OpenAI finishes squeezing insights (and cash) out of this experiment.

5 comments

Even this is a bit overly complicated/optimistic to me. Why not something as simple as: OpenAI has been building larger and larger models to great success for a long time. As a result, they were excited that this one was going to be so much larger, and therefore so much better, that the price to run it would be well worth the huge jump they were planning to get from it. What really happened is that this method of scaling hit a wall, and they were left with an expensive dud they won't get much out of, but they have to release something for now, otherwise they start falling well behind on the boards over the next few months. Meanwhile they scramble to find other means of scaling like "chain of thought + runtime" provided.
Thank you so much for this comment. I don't really understand the need for people to go straight to semi-conspiratorial hypotheses, when the simpler explanation makes so much more sense. All the evidence is that this model is much larger than previous ones, so they must charge a lot more for inference because it costs so much more to run. OpenAI were the OGs when it came to scaling, so it's not surprising they went this route and eventually hit a wall.

I don't at all blame OpenAI for going down this path (indeed, I laud them for making expensive bets), but I do blame all the quote-un-quote "thought leaders" who were writing breathless posts about how AGI was just around the corner because things would just scale linearly forever. It was classic "based on historical data, this 10 year old will be 20 feet tall by the time he's 30" thinking, and lots of people called them out on this, and they either just ignored it or responded with "oh, simple not-in-the-know peons" dismissiveness.

It is weird because this is a board for working programmers for the most part. So like, who’s seen a grand conspiracy actually be accomplished? Probably not many. A lackluster product that gets released even though it sucks because too many people are highly motivated not to notice that it sucks? Everybody has experienced that, right?
Exactly. Although I wouldn't even say they have blinders; it seems like OpenAI understands quite well what 4.5 can do and what it can't, hence the modesty in their messaging.

To your point, though, I would add not only who has seen any grand conspiracy actually be accomplished, who has seen one even attempted and kept under wraps? Such that the absence of corroborating sources was more consistent with an effectively executed conspiracy theory than the simple absence of such a plan.

It works until it doesn't and hindsight is 20/20.
> It works until it doesn't

Of course, that's my point. Again, I think it's great that OpenAI swung for the fences. My beef is again with these "thought leaders" who would write this blather about AGI being just around the corner in the most uncritical manner possible (e.g. https://news.ycombinator.com/item?id=40576324). These folks tended to be in one of two buckets:

1. "AGI cultists" as I called them, the "we're entering a new phase of human evolution"-type people.

2. People who had a motive to try and sell something.

And it's not about one side or the other being "right" or "wrong" after the fact, it's that so much of this just sounded like magical thinking and unwarranted extrapolations from the get go. The actual experts in the area, if they were free to be honest, were much, much more cautious in their pronouncements.

Definitely, the grifters and hypesters are always spoiling things, but even with a sober look it felt like AGI _could_ be around the corner. All these novel and somewhat unexpected emerging capabilities as we pushed more data through training, you'd think maybe that's enough? It wasn't and test time compute alone isn't either, but that's also hindsight to a degree.

Either way, AGI or not, LLMs are pretty magical.

If you've been around long enough to witness a previous hype bubble (and we've literally just come out of the crypto bubble), you should really know better by now. Pets.com, literally an online shop selling pet food, almost IPOd for $300M in early 2000, just before the whole dot-com bubble burst.

And yeah, LLMs are awesome. But you can't predict scientific discovery, and all future AI capabilities are literally still a research project.

I've had this on my HN user page since 2017, and it's just as true as ever: In the real world, exponentials are actually early stage sigmoids, or even gaussians.
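A quick numeric sketch of the exponential-vs-sigmoid point. The growth rate and ceiling below are arbitrary illustrative values, not fitted to any real data; the only thing it tries to show is that a logistic curve tracks a pure exponential closely early on and then flattens out.

```python
import math

def exponential(t, rate=1.0):
    """Pure exponential growth starting at 1."""
    return math.exp(rate * t)

def logistic(t, rate=1.0, ceiling=1000.0):
    """Logistic (sigmoid) growth starting at 1 and saturating at `ceiling`.

    `ceiling` is an assumed, illustrative limit; for small t this curve
    behaves almost exactly like exponential(t, rate).
    """
    return ceiling / (1.0 + (ceiling - 1.0) * math.exp(-rate * t))

# Print both curves side by side: early values nearly match, later ones diverge.
for t in (0, 1, 2, 4, 8, 12, 20):
    print(f"t={t:>2}  exponential={exponential(t):>16.1f}  logistic={logistic(t):>8.1f}")
```

From the early samples alone you could not tell which curve you were on, which is the trap with extrapolating from historical scaling data.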

Well that's only because YOU don't understand exponential growth! No human can! /s
In fundamental science terms, it also proves once and for all that more model doesn't mean more better. Any forces within OpenAI pushing to move past just growing the model for gains now have a strong argument for going all-in on new processes.
Time to enter the tick cycle.

I ask ChatGPT to give me a map highlighting all Spanish-speaking countries, and it gives me Stable Diffusion trash.

Just gotta do the grunt work, add a tool with a map api. Integrate with google maps for transit stuff.
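For what "add a tool with a map api" might look like in the small, here is a minimal sketch. Everything in it is hypothetical: `highlight_countries` is a stand-in for a real map API, the JSON shape of the tool call is an assumption rather than any vendor's actual format, and the country list is a partial sample, not a complete list.

```python
import json

# Partial, illustrative sample only; a real tool would use a proper dataset.
SPANISH_SPEAKING_SAMPLE = ["Argentina", "Chile", "Colombia", "Mexico", "Peru", "Spain"]

def highlight_countries(countries):
    """Hypothetical map tool: returns the request a real map API would render."""
    return {"type": "map", "highlighted": sorted(countries)}

# Registry of deterministic tools the model is allowed to call.
TOOLS = {"highlight_countries": highlight_countries}

def dispatch(tool_call_json):
    """Parse a model-emitted tool call (assumed JSON shape) and run the tool."""
    call = json.loads(tool_call_json)
    tool = TOOLS.get(call["name"])
    if tool is None:
        raise ValueError(f"unknown tool: {call['name']}")
    return tool(**call["arguments"])

# Simulated model output: the model picks the tool and arguments,
# and ordinary code does the actual drawing.
model_output = json.dumps({
    "name": "highlight_countries",
    "arguments": {"countries": SPANISH_SPEAKING_SAMPLE},
})
result = dispatch(model_output)
print(result)
```

The idea is that the LLM only decides what to ask for; the map itself comes from boring, deterministic code instead of an image generator.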

It's a good LLM already; it doesn't need to be Einstein and solve aerospatial equations. We just need to wait until they realize their limits and find the humility to build yet another useful product that won't conquer the world.

I’ve thought of LLMs as google 2.0 for some time now. Truly a world changing technology similar to how google changed the world, likely to have an even larger impact than google had as we create highly specialized implementations of the technology in the coming decade... but it’s not energy positive nuclear fusion, or a polynomial time NP solver, it’s just google 2.0
Google 2.0 where you have to check every answer it gives you because it's authoritative about nothing.

Works great when the output is small enough to unit test or immediately try in situations with no possible negative outcomes.

Anything larger? Skip the LLM slop and go to the source. You have to go to the source, anyway.

All while using far more energy than a normal google search
I keep wondering what the long-game (if any) of LLMs is... to make the world dependent on various models then jack the rates up to cover the costs? The gravy-train of SV funding has to end eventually... right?
> You have to go to the source, anyway.

Yeah, and then check that. I don't get this argument at all.

People who uncritically swallow the first answer or two they get from Google have a name... but that would just derail the thread into politics.

There is a truth in the grandparent's comment that doesn't necessarily conflict with this view. The Google 2.0 effect is not necessarily that it gives you a better correct answer faster than google. I think it never dawned on people how bad they were at searching about topics they didn't know much about or how bad google was at pointing them in the right direction prior to chatgpt. Or putting it another way, they never realized how much utility they would get out of something that pointed them in the correct direction even though they couldn't trust the details.

It turns out that going from not knowing what you don't know to knowing what you don't know adds an order of magnitude improvement to people's experience.

And the LLM by design does not save or provide sources. Unlike google or wikipedia which are transparent about sources.
It most certainly does, if you are using the latest models, which people making comments like this never are as a rule.
There is something to be said for trusting people’s (or systems of people’s) authority.

For example, have you ever personally verified that humans went to the moon? Have you ever done the experiments to prove the Earth is round?

> Have you ever done the experiments to prove the Earth is round?

I have, actually! Thanks, astronomy class!

I've even estimated the earth's diameter, and I was only like 30% off (iirc). Pretty good for the simplistic method and rough measurements we used.
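The comment doesn't say which method the class used. One classic classroom approach is Eratosthenes' shadow-angle method, sketched here purely as an assumption, with the usual round textbook figures (a 7.2 degree difference in the sun's angle between two sites roughly 800 km apart on a north-south line) rather than any real measurements.

```python
import math

# Commonly cited mean diameter of the Earth, used only to compute the error.
REFERENCE_DIAMETER_KM = 12742.0

def earth_diameter_km(shadow_angle_deg, distance_km):
    """Eratosthenes-style estimate.

    The angle difference is the fraction of a full circle that the
    north-south distance covers, so circumference = distance * 360 / angle.
    """
    circumference = distance_km * 360.0 / shadow_angle_deg
    return circumference / math.pi

def relative_error(estimate, reference=REFERENCE_DIAMETER_KM):
    """Fractional difference between an estimate and the reference value."""
    return abs(estimate - reference) / reference

# Assumed textbook inputs, not measured data.
estimate = earth_diameter_km(shadow_angle_deg=7.2, distance_km=800.0)
print(f"estimated diameter: {estimate:.0f} km, error: {relative_error(estimate):.2%}")
```

With rough classroom measurements of the angle and distance, an error in the tens of percent like the one described is unsurprising, since the result scales directly with both inputs.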

Sometimes authorities are actually authoritative, though, particularly for technical, factual material. If I'm reading a published release date for a video game, directly from the publisher -- what is there to contest? Meanwhile, ask an LLM and you may have... mixed results, even if the date is within its knowledge cutoff.

This is not a helpful phrasing, I think. Sources allow the reader to go as far down the rabbit hole as they are willing to or knowledgeable enough to go.

For example, if I'm looking for some medical finding and I get to a source that's a clinical study from a reputable publication, I may be satisfied and stop there since this is not my area of expertise. However, a person with knowledge of the field may be able to parse the study and pick it apart better than I could. Hence, their search would not end there since they would be unsatisfied with just the source I was satisfied with.

On the other hand, having no verifiable sources should leave everyone unsatisfied.

Have you provided documentation that you are human? Perhaps you are a lizard person sowing misinformation to firm up dominance of humankind.
I asked Claude to give me a script in Python to create a map highlighting all Spanish-speaking countries. It took 3 tries and then gave me a perfect SVG and PNG.
Interesting, I don't use claude. Could you provide us with a link of how you got the LLM to produce the map and how it looks?

I'm getting this. https://claude.ai/share/7a8ecdb0-a28c-4d48-ad81-2d9e95fab538

LLMs could make some nice little tools.

However they’ll need to replace vast swathes of the economy to justify these AI companies’ market caps.

> Just gotta do the grunt work, add a tool with a map api. Integrate with google maps for transit stuff.

This is kind of the crux though. The only way to make LLMs more useful is to basically make them traditional AI. So it's not really a leap forward, never mind a path to AGI.

Giving ChatGPT stupid AI image generation was a huge nerf. I get frustrated with this all the time.
Oh, I think it's great they did that. It's super helpful for visualizing ChatGPT's limitations. Ask it for an absolutely full, overflowing glass of wine or a wrist watch whose time is 6:30 and it's obvious what it actually does. It's educational.
They should have called it "ChatGPT Enterprise".
Exactly! designed specifically for people who love burning corporate budgets.
OpenAI is going to add it to Plus subscriptions. I.e. available for many at no additional cost. Likely with restrictions like N prompts/hour.

As for API price, when it matters businesses and people are willing to pay much more for just a bit better results. OpenAI doesn't take the other options away. So we don't lose anything.

IMO the 4o output is a lot more Enterprise-compatible; 4.5, being straight to the point and more natural, is quite the opposite. Pricing-wise your point stands.

Disclaimer: have not tried 4.5 yet, just skimmed through the announcement, using 4o regularly.

Apparently, OpenAI API “credits” expire after a year. I stupidly put in another $20 and am now trying to blow through them; 4.5 is the easiest way, considering recent 4o has fallen out of favor for other models and I don’t want to just let them expire again. An expiry after only one year is asinine.
I worked at a company that expired credits after 365 days. My layman understanding is the credits sit as a liability until consumed, and the alternative is having potentially millions of accounts with a combined hundreds of millions of dollars in 'liabilities' on their books.

More info from ChatGPT here: https://chatgpt.com/share/67c5ad36-2f84-8001-a61f-11d4e17135...

Yes. I also discovered this, and was also forced to blow through my credits in a rush. Terrible policy.
I'm learning this for the first time now. I don't appreciate having to anticipate how many credits I'll use like it's an FSA account.
>Terrible policy.

And unfortunately one not exclusive to OpenAI. Anthropic credits also expire after 1 year.

Not sure how it's with OpenAI, but Anthropic is so money-hungry, they won't even let you remove your debit card data from your account without a week-long support encounter.
This is how pricing on human labour works. Nobody expects an employee that costs twice as much to produce twice the output for any given task. All that is expected is that they can do a narrow set of things that another person can't.