by bigstrat2003 101 days ago
> Also, there's the pace of advancement of the models. Many people formed their opinions last year, and the landscape has changed a lot.

People have been saying this every year for the last 3 years. It hasn't been true before, and it isn't true now. The models haven't actually gotten smarter, they still don't actually understand a thing, and they still routinely make basic syntax and logic errors. Yes, even (insert your model of choice here).

The truth is that there just isn't any juice to squeeze in this tech. There are a lot of people eagerly trying to get on board the hype train, but the tech doesn't work and there's no sign in sight that it ever will.

4 comments

Maybe I'm solving different problems to you, but I don't think I've seen a single "idiot moment" from Claude Code this entire week. I've had to massage things to get them more aligned with how I want things, but I don't recall any basic syntax or logic errors.
With the better harness in Claude Code, the >4.5 model, and a somewhat thought-out workflow, we’ve definitely arrived at a point where I find it very helpful. The less you rely on one-shot prompts and the more you give meaningful context and a well-defined, testable goal, the better it is. It honestly makes me worry about how much better it can get and whether some percentage of devs will become obsolete. It requires less hand-holding than many people I’ve worked with, and the results come out 100x faster.
I saw a few (Claude Sonnet 4.6), easily fixed. The biggest difference I’ve noticed is that when you say it has screwed up, it is much less likely to go down a hallucination path and can be dragged back.

Having said that, I’ve changed the way I work too: more focused chunks of work with tight descriptions and sample data and it’s like having a 2nd brain.

Very good way to describe it. I am enjoying Opus a lot.
I swear some people are using some other tech than I'm using the past few months. Where I work, Claude Code is developing major changes to our very large code base (many repos, millions upon millions of lines of really important code) and pushing to prod regularly. Even the most bearish of engineers are now using it to ship important code daily. It still has issues and you have to know how to use it, but it is a shocking productivity increase (although Amdahl's Law applies to software engineering, too: coding is only a relatively small percentage of what is done).
> I swear some people are using some other tech than I'm using the past few months.

I'm curious about this discrepancy too. I assume that you're being facetious and the discrepancy is with people's perceptions of AI’s capabilities or usefulness or whatever subjective metric. Some, myself included, seem to perceive it as basically useless, while others, yourself included, seem to imply that it's at a level where it genuinely replaces competent coders.

If the discrepancy were small, it could just be chalked up to the metric being subjective. But it seems to be like night and day. A difference of orders of magnitude. I wanna know what's going on there.

Did you lay off some engineers or keep them but make the software better?
All I know is it feels very different using it now than it did a year ago. I was struggling to get it to do anything too useful a year ago, just asking it to do a small function here or there, often not being totally satisfied with the results.

Now I can ask an agent to code a full feature and it has been handling it more often than not, often getting almost all of the way there with just a few paragraphs of description.

And yet I just eliminated three months (easily) of tech debt on our billing system in the past two weeks.