by gymbeaux
21 days ago
Even though the synthetic benchmarks paint a picture of LLMs coming a long way since 2022, my practical experience has been that they aren’t tangibly better. No doubt someone reading this will chime in and say LLMs are way better at writing code or whatever, and maybe that’s true, but there’s no difference between ChatGPT 3.5 and Claude Opus 4.8 as far as my trusting the output goes. Opus 4.8 still messes up plenty. It’s particularly bad at identifying and fixing problems in CI YAML, but it struggles in the usual areas too. So I’m thinking we’ve just about reached the apex with LLMs, and they have failed at replacing software engineers (companies can freeze hiring juniors at their own future peril, using any excuse they like).