Hacker News
by nougati 259 days ago
I'm surprised at this. LLMs have had many developments since GPT-3.5, technologically and culturally. What kind of developments would be impressive to you?
2 comments

This is a common sentiment from my peers who have not spent any real time with the frontier models in the last six months.

They tend to poke the free ChatGPT with ill-defined requests and come away disappointed.

Same experience here, using new models. Every time it's a disappointment. Useful for search queries that are not too specialized. That's it.

I get pretty good results with Claude Code, Codex, and to a lesser extent Jules. They can navigate a large codebase and get me started on a feature in a part of the code I'm not familiar with, and do a pretty good job of summarizing complex modules. With very specific prompts they can write simple features well.

The nice part is I can spend an hour or so writing specs, start 3 or 4 tasks, and come back later to review the result. It's hard to be totally objective about how much time it saves me, but generally feels worth the 200/month.

One thing I'm not impressed by is the ability to review code changes. That's been mostly a waste of time, regardless of how good the prompt is.

Company expectations are higher too. Many companies expect 10x output now due to AI, but the technology has been growing so quickly that there are a lot of people and companies who haven't realized that we're in the middle of a paradigm shift.

If you're not using AI for 60-70 percent of your code, you are behind. And yes, 200 per month for AI is required.

We've been trialing CodeRabbit at work for code review. I have various nits to pick but it feels like a good addition.

Maybe if OpenAI let me generate an image through the API? That would impress me. Instead, they took away temperature and gave us verbosity and reasoning effort to think about every time we make an API call.
Then you should be very impressed, because they let you generate videos through the API: https://platform.openai.com/docs/models/sora-2

That's a low bar.