HN Mirror

by olao99 84 days ago

For what it's worth, I find GPT 5.5 qualitatively different from 5.4 and 5.3.

If I had to collapse the nature of the difference into one sentence, it'd be that 5.5 does more of what I'm asking it to do, versus doing a small aspect of what I'm asking and then stopping.

5.4 required a lot of "continue" encouragement. 5.5 just "gets it" a bit more.

What it boils down to for me is that, even though it's more expensive, I would much rather use 5.5 on low than 5.4/5.3 on high/medium.

3 comments

darqis 83 days ago

5.5 is overcomplicating it. Where the solution is, e.g., changing some OIDC auth URL, it goes around verifying, checking, and building this and that before eventually changing the URL and then writing a summary.

It is unable to do K.I.S.S. Instead of just adding an endpoint, it creates a service, middleware, a config reader, and finally an endpoint.

LLMs are nowhere near being good developers. The only thing they have is speed. Because of this speed they create the illusion of a good developer, the "whoa" moment. Whoa, it would've taken me 2 months to implement this. Yeah, but then again you would not make such silly mistakes, and you would've reused that OIDC client instead of reinventing the wheel every single time.

varispeed 84 days ago

They must have changed something recently, because when 5.5 first dropped I was unable to make it do anything. It would say it would implement something but would never actually do it, no matter how many times I told it what it needed to do. It would acknowledge what needed to be done, even create a step-by-step plan, and then ask if it should do it. I would confirm, and then it would just go around reiterating the plan and saying that this time it would start. Annoying and funny. Now it doesn't seem to be doing that anymore.

Wingy 84 days ago

I think that's a failure mode of using the legacy Completions API rather than the new Responses API. With the Responses API, the agent actually goes and does the things it's supposed to do.

nothinkjustai 84 days ago

They probably just tell it to do more in the prompt lmao
