Hacker News
by gymbeaux 21 days ago
Even though the synthetic benchmarks paint a picture of LLMs coming a long way since 2022, my practical experience has been that they aren’t tangibly better. No doubt someone reading this will chime in and say LLMs are way better at writing code or whatever, and maybe that’s true, but there’s no difference between ChatGPT 3.5 and Claude Opus 4.8 as far as my trusting the output goes. Opus 4.8 still messes up plenty. It’s particularly bad at identifying and fixing problems in CI YAML, but it struggles in the usual areas too.

So I’m thinking we’ve just about reached the apex with LLMs, and they have failed at replacing software engineers (companies can freeze hiring juniors, at their own future peril, using any excuse they like).

2 comments

Yep, that has been my experience as well. There hasn't been any meaningful improvement in LLMs since ChatGPT first launched. They still fall over, in the same ways, and at more or less the same high rates.
The difference for me has been that you can now use an LLM as your main typing interface. You couldn't do that (without it being annoying) before, I think, Opus 4.5.
I've heard of principal-level engineers saying their typing is bad now.

Skill atrophy is real, especially for the stupid.

I'm typing to talk to the LLM, so that skill is definitely not lost.

The main atrophy I'm concerned about is the ability to hold the state of a massive piece of executing code in my head.

Now with an LLM I just ask "show me the state by putting comments in the code" and I just read it.