|
|
|
|
|
by CuriouslyC
245 days ago
|
|
GPT5 may have been underwhelming to _you_. Understand that they're heavily RLing to raise the floor on these models, so they might not be magically smarter across the board, there are a LOT of areas where they're a lot better that you've probably missed because they're not your use case. |
|
i have yet to hear anyone seriously explain to me a single real-world thing that GPT5 is better at with any sort of evidence (or even anecdote!) i've seen benchmarks! but i cannot point to a single person who seems to think that they are accomplishing real-world tasks with GPT5 better than they were with GPT4.
the few cases i have heard that venture near that ask may be moderately intriguing, but don't seem to justify the overall cost of building and running the model, even if there have been marginal or perhaps even impressive leaps in very narrow use cases. one of the core features of LLMs is they are allegedly general-purpose. i don't know that i really believe a company is worth billions if they take their flagship product that can write sentences, generate a plan, follow instructions and do math and they are constantly making it moderately better at writing sentences, or following instructions, or coming up with a plan and it consequently forgets how to do math, or becomes belligerent, or sycophantic, or what have you.
to me, as a user with a broad range of use cases (internet search, text manipulation, deep research, writing code) i haven't seen many meaningful increases in quality of task execution in a very, very long time. this tracks with my understanding of transformer models, as they don't work in a way that suggests to me that they COULD be good at executing tasks. this is why i'm always so skeptical of people saying "the big breakthrough is coming." transformer models seem self-limiting by merit of how they are designed. there are features of thought they simply lack, and while i accept there's probably nobody who fully understands how they work, i also think at this point we can safely say there is no superintelligence in there to eke out and we're at the margins of their performance.
the entire pitch behind GPT and OpenAI in general is that these are broadly applicable, dare-i-say near-AGI models that can be used by every human as an assistant to solve all their problems and can be prompted with simple, natural language english. if they can only be good at a few things at a time and require extensive prompt engineering to bully into consistent behavior, we've just created a non-deterministic programming language, a thing precisely nobody wants.