| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by bangaroo 246 days ago

every time i say "the tech seems to be stagnating" or "this model seems worse" based on my observations i get this response. "well, it's better for other use cases." i have even heard people say "this is worse for the things i use it for, but i know it's better for things i don't use it for."

i have yet to hear anyone seriously explain to me a single real-world thing that GPT5 is better at with any sort of evidence (or even anecdote!) i've seen benchmarks! but i cannot point to a single person who seems to think that they are accomplishing real-world tasks with GPT5 better than they were with GPT4.

the few cases i have heard that venture near that ask may be moderately intriguing, but don't seem to justify the overall cost of building and running the model, even if there have been marginal or perhaps even impressive leaps in very narrow use cases. one of the core features of LLMs is they are allegedly general-purpose. i don't know that i really believe a company is worth billions if they take their flagship product that can write sentences, generate a plan, follow instructions and do math and they are constantly making it moderately better at writing sentences, or following instructions, or coming up with a plan and it consequently forgets how to do math, or becomes belligerent, or sycophantic, or what have you.

to me, as a user with a broad range of use cases (internet search, text manipulation, deep research, writing code) i haven't seen many meaningful increases in quality of task execution in a very, very long time. this tracks with my understanding of transformer models, as they don't work in a way that suggests to me that they COULD be good at executing tasks. this is why i'm always so skeptical of people saying "the big breakthrough is coming." transformer models seem self-limiting by merit of how they are designed. there are features of thought they simply lack, and while i accept there's probably nobody who fully understands how they work, i also think at this point we can safely say there is no superintelligence in there to eke out and we're at the margins of their performance.

the entire pitch behind GPT and OpenAI in general is that these are broadly applicable, dare-i-say near-AGI models that can be used by every human as an assistant to solve all their problems and can be prompted with simple, natural language english. if they can only be good at a few things at a time and require extensive prompt engineering to bully into consistent behavior, we've just created a non-deterministic programming language, a thing precisely nobody wants.

3 comments

48terry 246 days ago

The simple explanation for all this, along with the milquetoast replies kasey_junk gave you, is that to its acolytes, AI and LLMs cannot fail, only be failed.

If it doesn't seem to work very well, it's because you're obviously prompting it wrong.

If it doesn't boost your productivity, either you're the problem yourself, or, again, you're obviously using it wrong.

If progress in LLMs seems to be stagnating, you're obviously not part of the use cases where progress is booming.

When you have presupposed that LLMs and this particular AI boom is definitely the future, all comments to the contrary are by definition incorrect. If you treat it as a given that this AI boom will succeed (by some vague metric of "success") and conquer the world, skepticism is basically a moral failing and anti-progress.

The exciting part about this belief system is how little you actually have to point to hard numbers and, indeed, rely on faith. You can just entirely vibe it. It FEELS better and more powerful to you, your spins on the LLM slot machine FEEL smarter and more usable, it FEELS like you're getting more done. It doesn't matter if those things are actually true over the long run, it's about the feels. If someone isn't sharing your vibes about the LLM slot machine, that's entirely their fault and problem.