Hacker News new | ask | show | jobs
by simonw 449 days ago
"my experience is that in the context of working in a large systems codebase with years of history, it’s just not that useful."

It's possible that changed this week with Gemini 2.5 Pro, which is equivalent to Claude 3.7 Somnet in terms of code quality but has a 1 million token context (with excellent scores on long context benchmarks) and an increased output limit too.

I've been dumping hundreds of thousands of times of codebase into it and getting very impressive results.

1 comments

See this is one of the things that’s frustrating about the whole endeavor. I give it an honest go, it’s not very good, but I’m constantly exhorted to try again because maybe now that Model X 7.5qrz has been released, it’ll be really different this time!

It’s exhausting. At this point I’m mostly just waiting for it to stabilize and plateau, at which point it’ll feel more worth the effort to figure out whether it’s now finally useful for me.

Not going to disagree that it's exhausting! I've been trying to stay on top of new developments for the past 2.5 years and there are so many days when I'll joke "oh, great, it's another two new models day".

Just on Tuesday this week we got the first widely available high quality multi-modal image output model (GPT-4o images) and a new best-overall model (Gemini 2.5) within hours of each other. https://simonwillison.net/2025/Mar/25/