by evrenesat 95 days ago
Qwen3 Coder Next and Qwen3.5-35B-A3B are already very good and can be run on today's higher-end home computers at good speed. Tomorrow's machines will not be slower, and models keep getting more efficient. A good software engineer will still be valuable in tomorrow's world, but not as a software assembler.
3 comments

Even cutting-edge models are not very good. They are not even at a mediocre level. Don’t get me wrong, they are improving, and they are awesome, but they are nowhere near good yet. Vibe-coded projects have more bugs than features, their architecture and design systems are terrible, and their tests are completely useless about half the time. If you want a good product, you need to rewrite almost everything written by LLMs. This probably won’t be the case in a few years, but right now even “very good” LLMs are not very good at all.
Not sure why you're being downvoted, this is very much my experience. When it matters (like, customer data is on the line) vibecoded projects are not just hilariously bad, but put you in legal danger.

So far we've found that Claude Code is fine as a kind of better Coverity for uncovering memory leaks and similar. You have to check its work very carefully, because about 1 time in 5 it just gets stuff wrong. It's great that it gets stuff right 4 times in 5 and produces natural code that fits the style of the existing project, but it's nothing earth-shattering. We've had tools to detect memory leaks before.

We had someone attempt to translate one of our existing projects into Rust and the result was just wrong at a fundamental level. It did compile and pass its own tests, so if you had no idea about the problem space you might even have accepted its work.

With Claude Code now having a /plan mode - you can take your time and deliberate through architecture and design, collaboratively, instead of just sending a fire-and-forget. Much less buggy and saves time if you keep an eye on the output as you go, guiding it and catching defects, imho.
For that, you need to be building something where you already know exactly how you want to code it, or what architecture is needed. In other words, you would gain basically nothing, because typing was never the real bottleneck (no matter what Vim and Emacs people would tell you).

LLMs also make mistakes at a much lower level than those one-pagers let you control in planning mode. Which I use all the time, btw. And anyway, they throw the plan out of the window as soon as their attempted solutions don't work during execution, for example when a generated test fails.

Btw, changing the plan after it has been generated is painful. More often than not, when I decline it with comments, it generates a worse version, because it either misses things from the previous one that I never mentioned, or changes the architecture to a completely worse one. In my experience, it's better to restart the whole thing with a more precise prompt.

Ah, this is true - for my purposes, I've been directing the design and deliberating on the constraints and specifications for a larger system in tandem with smaller planning sessions.

That has worked well so far, but yes, you are totally right, there are still quite a few pain points and it is still rather far from being fire-and-forget "build me a fancy landing page for a turnkey business" and getting enterprise quality code.

edit: I think it is most important that you collaborate with Claude Code on quality in a systematic way, but even that has limits, right now - 1M context changes things a little bit.

You know, with all the babysitting needed, I wonder if effort is not better spent in just, you know, writing code.

Can you actually quantify the time & effort 'saved' letting LLM generate code for you?

For me, personally, I'm building things that would have been impractical for me to do as cleanly in the same amount of time: prototypes in languages that I don't have the muscle memory for, using algorithms I have a surface-level understanding of but would need time to deeply understand and implement by hand. At my pace, as a retired dev, the savings are probably measured in years' worth of time and effort.

edit: also, would I take the time to implement LCARS by hand? No. But with an LLM, sure: it took about 3 minutes or less to implement a pretty decent LCARS interface for me.

> Tomorrow's machines will not be slower

The way it's going, the AI hyperscalers are buying such a big portion of the world's hardware that tomorrow's machines may very well get slower per dollar of purchase value.

Not my experience. Current Qwen Coder is noteworthy but still far from good. You can't compare it with current commercial offerings; they are just in different leagues.