by HAL3000 296 days ago
> gpt5 has always been about making a "collection of models" work together and not about model++.

No, it wasn’t. Have you read or listened to Altman’s hype around GPT-5 from a year ago? They changed the narrative after the 4.1 flop, which they thought would be GPT-5, and it seems some people fell for it.

> Capabilities ~90-110% of their top tier old models at 4-6x lower price

Maybe they finally implemented the DeepSeek paper.


This is Altman before the release:

OpenAI's CEO says he's scared of GPT-5

https://www.techradar.com/ai-platforms-assistants/chatgpt/op...

Sam Altman Compares OpenAI To The Manhattan Project—And He's Not Joking About the Risks

https://finance.yahoo.com/news/sam-altman-compares-openai-ma...

This is Altman after the release:

Sam Altman says ‘yes,’ AI is in a bubble

https://www.theverge.com/ai-artificial-intelligence/759965/s...

> No, it wasn’t.

I replied below in this thread with the specific post from 6 months ago.

> After that, a top goal for us is to unify o-series models and GPT-series models by creating systems that can use all our tools, know when to think for a long time or not, and generally be useful for a very wide range of tasks.

> In both ChatGPT and our API, we will release GPT-5 as a system that integrates a lot of our technology, including o3. We will no longer ship o3 as a standalone model.

"the delta between 5 and 4 will be the same as between 4 and 3"[1]

Obviously it's not.

1. https://lexfridman.com/sam-altman-2-transcript/

GPT-4 was a long time ago, and honestly mostly useless. But a lot of that progress was already present in the intervening models, and it's easy to forget it happened when comparing GPT-5 to the state of the art a month ago rather than two years ago.

This is hard to quantify exactly, since very few benchmarks have the kind of scale where comparing two deltas would be meaningful. But if we pick the Artificial Analysis composite score[0] as the baseline, GPT-3.5 Turbo was at 11, GPT-4 at 25, and GPT-5 at 69. It's just that most of the post-GPT-4 improvement came with o1 and o3.

Feels like a pretty fair statement.

[0] https://artificialanalysis.ai/#frontier-language-model-intel...
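
To spell out the delta comparison above, here is a rough sketch in Python. It uses only the three scores quoted in this comment (not independently checked) and, like the comment, assumes GPT-3.5 Turbo as the stand-in for "3". The `compare` helper is just an illustrative name.

```python
# Scores as quoted in the comment above (Artificial Analysis composite score).
# Assumption: GPT-3.5 Turbo stands in for "GPT-3" in Altman's 3 -> 4 -> 5 framing.
scores = {"GPT-3.5 Turbo": 11, "GPT-4": 25, "GPT-5": 69}


def compare(old, new):
    """Return (absolute delta in points, ratio) between two models' scores."""
    return scores[new] - scores[old], scores[new] / scores[old]


delta_34, ratio_34 = compare("GPT-3.5 Turbo", "GPT-4")
delta_45, ratio_45 = compare("GPT-4", "GPT-5")

print(f"3.5 -> 4: +{delta_34} points, {ratio_34:.2f}x")  # +14 points, 2.27x
print(f"4 -> 5:   +{delta_45} points, {ratio_45:.2f}x")  # +44 points, 2.76x
```

On these numbers the 4 -> 5 gap is at least as large as the 3.5 -> 4 gap whether you look at points or ratios, though that only means something to the extent the composite score behaves like a linear scale, which is the caveat already raised above.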