by pizlonator 7 days ago
What I can’t get over is that there have been exactly zero software breakthroughs since vibe coding started, other than vibe coding itself.

Claude is amazing, that’s true.

But if it was as amazing as this article implies, I’d expect some breakthrough outside of AI itself.

Rewriting a Zig program in unsafe Rust? Not a breakthrough. Finding a bunch of security vulns? Maybe that’s sort of a breakthrough though it’s underwhelming and possibly just a net negative. But like if I rolled back to using software from 2023 then life would be ok.

Maybe we just need to give it time, and sometime real soon, we will all be amazed by such a breakthrough? Who knows

14 comments

Maybe my bar for what constitutes a breakthrough is lower than other people's, but all of these seem like breakthroughs to me:

NLP as a field saw huge shifts. NLP tasks that used to be complex and inaccurate can now be set up very easily and quickly using structured outputs from LLMs, often with greater accuracy.
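
To make "structured outputs" a bit more concrete, here is a minimal sketch of the validation side only. The schema, the field names, and the sample response string are all made up for illustration, and no real LLM API is called; a real setup would get the JSON string from whatever client is in use.

```python
import json
from dataclasses import dataclass

# Hypothetical schema for a sentiment + entity extraction task.
@dataclass
class Extraction:
    sentiment: str
    entities: list[str]

ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}

def parse_extraction(raw: str) -> Extraction:
    """Validate a JSON string (as a model might return it) against the schema."""
    data = json.loads(raw)
    sentiment = data["sentiment"]
    if sentiment not in ALLOWED_SENTIMENTS:
        raise ValueError(f"unexpected sentiment: {sentiment!r}")
    entities = data["entities"]
    if not isinstance(entities, list) or not all(isinstance(e, str) for e in entities):
        raise ValueError("entities must be a list of strings")
    return Extraction(sentiment=sentiment, entities=entities)

# Stand-in for a model response (assumption: the model was asked for this JSON shape).
sample_response = '{"sentiment": "positive", "entities": ["Manus", "charity"]}'
result = parse_extraction(sample_response)
```

The point is that the downstream code only ever sees a checked `Extraction`, so a malformed response fails loudly instead of leaking into the pipeline.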

A small charity I help with has now been able to build their own website to manage their day-to-day operations. It saves them a lot of time, and it was vibe-coded using Manus. I don't think people appreciate how much room there is left for bespoke software to have big impacts on small organisations that can't afford to hire developers. The cost for software like the one they made has gone from tens of thousands of dollars to $10/month and volunteer hours.

My brother has recently been setting up Cowork to do an automatic review of contracts before human review, and he said it is far more diligent than people when it comes to routine things to check. This is another huge breakthrough for not just efficiency, but the quality of work.

I really don't think we can discount AI finding bugs and vulnerabilities. If you care about code quality and keep up review standards, LLMs can help you write more robust software. AI has found a huge number of bugs for me before they hit production, including potential out-of-bounds memory accesses and segfaults.

ChatGPT has 1 billion MAU. People are now getting life advice, financial advice, and mental health help from chatbots at a scale and cost that no human support network could match.

> ChatGPT has 1 billion MAU. People are now getting life advice, financial advice, and mental health help from chatbots

Personally not the kind of breakthrough I'm psyched about

Yeah, the thing that worries me is that an LLM can be guided to agree with any premise and will rarely ever take a hard stance.
…which is why it’s led to more than zero suicides.
There are many known cases of it saving lives.

Also, they have done a good job shutting down the psychotic behavior you could get from 4o era models. If there are remaining issues like that they ought to fix them too.

Well, you're not twisting yourself into knots to identify breakthroughs. Try harder!
> ChatGPT has 1 billion MAU. People are now getting life advice, financial advice, and mental health help from chatbots at a scale and cost that no human support network could match.

That's terrifying.

You realize that's terrifying, right?

Definitely, it is quite an extreme change. But the upsides of better access to support and advice are huge, even if the potential downsides are scary as well. This feels like one area where we need better transparency and regulation due to how much ChatGPT and others can affect people who listen to them.
It's in a weird space right now.

These models are actually extremely good but they are far from an intelligence unto themselves. Truth is if someone told you they could build these things 5 years ago, you'd write them a check for a trillion dollars. Problem is once we got them, we realized they are not all that. It's like a mecha suit in a universe where mecha suits are abundant and cheap. Someone has to climb into them every day and put in the work for it to be effective.

So now the skeptics are saying this technology is overrated. And the optimists are accusing the skeptics of moving goal posts.

I think we are learning in real-time what intelligence re. humans is as we go along.

Humans only know what they know, until they acquire more information about what's possible.

The goal post narrative is stupid to begin with.

Humans have goal seeking behavior. LLMs don’t. You could maybe call the combination of LLMs and the RL-based harnesses somewhat “intelligent” in aggregate, but the problem is that it’s not “general” intelligence like these labs want to argue, since it’s by definition only good for the set of problems the RL part has been trained to solve, which is a subset of programming problems.
> Problem is once we got them, we realized they are not all that.

Isn't this just the hype cycle? [1]

Fake edit: I know it's not a perfect model.

1: https://www.gartner.com/en/research/methodologies/gartner-hy...

The problem is what they can do is rapidly expanding. Software development is becoming increasingly hands off.

If they get to the point where they're smart enough to make tasteful code decisions based on stakeholder input... we're cooked as a profession.

Most of the skeptics exist because of the grandiose claims made by the AI companies, which are pure hype marketing bs. If this was just a tool, discussed at the scope of what the tools can actually produce and do, there would be sensible discourse about it.
I am doing a solo project that is pretty big, meaning it is not something I could vibe code. I can do a lot with AI that I could never do on my own, but I am not seeing a several-multiples improvement in my productivity. I spend so much time doing what I call "AI wrangling", trying to get it to do what I want. Claude is writing all the JavaScript and Python code, but ultimately I am programming in English. What is good is that it is effectively a very high level computer language, where the agent can often implement a lot of underlying code from a short English description. But many other times it takes a lot of work to get what you want.
I measured an ~8x increase in the number of commits I've been pushing, and I've actually been trying to restrain myself. I could do a lot more if I stopped reviewing and editing the code. I think it's got more to do with my executive ability than raw productivity though. AI essentially cured my ADHD by making the execution of my ideas virtually painless.
LOL "I measured an 8x increase in the number of commits I've been pushing" is an absolutely useless statement
Subscribed to Claude a few months ago. I immediately started working with it on my programming language. Since then, I've implemented a compacting garbage collector, a size class based memory allocator, a unified value heap, deeply optimized hash tables and even implemented shapes like V8 and Self, redesigned the value representation, created a Common Lisp style condition system, implemented UTF-8 text decoding, refined the generators API, increased the number of tests from ~200 to ~1200 and improved the test suite to the point it runs all of those tests in parallel in under two seconds, implemented stack protection support, added an aarch64 matrix to the GitHub CI, fixed a zillion bugs, improved performance, perfected tail call optimization. I did so much stuff I'm probably forgetting some. And these aren't "lol just do it" prompts either, I'm putting effort into refining design and implementation. I review every line. Just finished designing safe hash table iteration in spite of mutability: generation counters that get bumped whenever the table is reallocated. It's actually gonna be more powerful than what other languages do. Next up on my todo list is to implement my language's unified pattern matcher, static allocation for all interpreter internal data in order to get rid of all initialization code and achieve nearly zero startup time, and then finally a bytecode interpreter to close the performance gap on the likes of Python.
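
For anyone wondering what the generation counter idea looks like, here is a toy sketch in Python. It is not the actual implementation from my language: the `Table` class, the linear probing, the 1/2 load factor, and the choice to raise an error on a stale iterator are all assumptions made just to show the mechanism (bump a counter on reallocation, have iterators compare against the value they captured).

```python
class Table:
    """Toy open-addressing hash table with a generation counter (illustrative only)."""

    def __init__(self, capacity: int = 4):
        self._slots = [None] * capacity   # each slot is None or a (key, value) pair
        self._count = 0
        self.generation = 0               # bumped only when storage is reallocated

    def _probe(self, slots, key):
        # Linear probing; the load factor cap in set() guarantees a free slot exists.
        i = hash(key) % len(slots)
        while slots[i] is not None and slots[i][0] != key:
            i = (i + 1) % len(slots)
        return i

    def _grow(self):
        new_slots = [None] * (len(self._slots) * 2)
        for entry in self._slots:
            if entry is not None:
                new_slots[self._probe(new_slots, entry[0])] = entry
        self._slots = new_slots
        self.generation += 1              # every live iterator is now stale

    def set(self, key, value):
        i = self._probe(self._slots, key)
        if self._slots[i] is None:
            if (self._count + 1) * 2 > len(self._slots):  # keep load factor <= 1/2
                self._grow()
                i = self._probe(self._slots, key)
            self._count += 1
        self._slots[i] = (key, value)     # updating an existing key never reallocates

    def items(self):
        seen = self.generation            # generation captured when iteration starts
        index = 0
        while index < len(self._slots):
            if self.generation != seen:
                raise RuntimeError("table was reallocated during iteration")
            entry = self._slots[index]
            if entry is not None:
                yield entry
            index += 1
```

In this sketch, updating an existing key in the middle of a loop is fine, while an insert that forces a reallocation makes the next step of the iterator fail instead of walking freed or moved storage.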

Dramatically improved my static site generator Pugneum to the point it's better than markdown and added Atom and RSS feeds, used it to write several articles about my language. Pace is so fast I actually need to write those articles by hand in order to crystallize the knowledge I learned. If I don't I'm afraid I'll just forget everything. No LLMs for the articles themselves, but they sure as hell took all the pain away from writing them. Pugneum even has back references and table of contents generation now. Claude even helped me refine my website's CSS, something I'm not very good at.

Also created my own invoicing system for $DAYJOB so I can invoice companies from my terminal. Started a decompilation project for my cherished childhood games and I've already almost finished decompiling one game's engine after just a few days. Been working on my cyberdeck project too, this one's a bit slow because I got to the point where I'll actually need to spend money on it to move forward. All this inside the rootless development virtual machine system built on top of QEMU and systemd that I developed together with Claude, whose network isolation I'm currently hardening. Started reverse engineering my laptop again! And I'm actually making progress! Made a color scheme app for the keyboard LEDs controller I made many years ago, with loads and loads of color schemes! Found some kind of bug in my keyboard while doing it, in less than an hour I had the root cause and a fix applied locally, sent the fix to systemd, it got merged. Planning to ramp up my free and open source software participation as well now that exploring codebases is a breeze. Already have some mesa patches ready for upstream. Have been playing with strace since I use it so much.

Better?

I’m sure rapor99 is unimpressed while not being able to point to any similar accomplishments in their own work in the same timeframe.
I'm building a memory safe programming language with a declarative concurrency model that's close to release.

There is ZERO chance I would ever be able to complete it on my own.

I doubt it'll get traction, but if it doesn't, I am pretty confident a future language will take the ideas for polymorphic synchronization and profile-guided optimization.

It has an easy version/mode of compilation that makes Rust's affine ownership accessible like a high-level scripting language, and it can progressively become more strict, where the compiler does ~99% of the work for you, and you just pick options as it finds issues (that it explains to you like you're 5) along the way.

Along the way, I also built a suite of tools that helps identify complexity better than anything I've seen (which was necessary to get the LLMs to be able to unslop themselves and write something that actually works).

I doubt the Ruby community shrugs it off, but time will tell.

How do you know it’s actually memory safe?
I have ~5500 memory safety fuzz tests, four different test suites with between ~80%-99% line/branch coverage each, and the same design as Rust, and haven't found a memory safety issue in 4 weeks, and I'm still planning another ~4 weeks of testing before release, more if need be.

Rust had memory safety bugs well after release - IIUC all the way until after the 1.0 release.

So, it's highly unlikely to be perfect, but I think it'll be in better shape than Go or Rust were when they initially launched.

I have the same experience, though I feel myself getting better at wrangling over the past few months
I spent years in the early 2000s trying to get a computer to read unstructured PDFs and TIFF images (mainly invoices, either scanned or electronic). Limited success, we always had to get a human to look at them in the end.

We implemented that in about three days earlier this year, just by feeding the files to LLMs. And it's good enough to not need a human to check.

I get that this isn't a "Computer Science breakthrough" in the sense you mean, but it used to involve a lot of hard CS to try and solve, and now it doesn't.

Maybe I'm looking through rose colored glasses, but software that writes itself seems like a pretty big breakthrough to me.
That goes straight to my point: then why hasn’t the miracle of automated coding led to breakthroughs outside of automated coding?

If the only breakthrough is automated coding with no outside consequence then it’s just masturbation

Probably because AI coding has only worked at all for a couple years and has only gotten good in like the last year?

The rate of improvement has been fast. Maybe it’ll plateau soon, or maybe we’ll have LLMs improving themselves rapidly. At this point it’s too early to say.

I don’t remember where I heard it, but there’s a saying that people overestimate how much can be accomplished in a year and underestimate how much can be accomplished in 10 years.

If we get to 2030 and still people are wondering where the breakthrough is, then I think I’d be agreeing with your skepticism. But I just think it’s too early to judge that yet.

Yeah, this is a good point.

But the clock is ticking.

on what? Who the fuck would go full transparency of what's in their black box in this hostile culture of AI hatred? None of us can put a number on what code we've used in our services that was written by humans and long may it last.
They literally can’t go full-transparency. I know a high-level insider, and the fact is that even the folks implementing things don’t actually know how it works, only that it does, and how to get it to generally behave.
N=1, but Claude etc. have made a huge difference to my life personally.

Built a bunch of software tools to streamline my small ecommerce business - while also running it - and things have turned around from "losing money and ready to pull the plug" to "looking at our best financial year on record" in the span of about 8 months.

I could imagine it wouldn't make a huge difference to the life of someone deeply entrenched in a traditional tech role, trying to get an extra 9 of reliability in a service or roll out a new carefully planned and QA'd feature.

But for tech-adjacent people, it gives us something "good enough", instantly, and basically for free.

That doesn't include the other things I've got it to do (gave Claude SSH access and got it to successfully debug a hang on my Ubuntu server, chucked Codex in a folder full of financial data and got it to find every piece of misclassified payroll transaction data)

Genuinely the biggest breakthrough for "casual" tech users since Excel.

The joke used to be “be nice or I’ll replace you with a small shell script” - Claude lets you actually get those scripts written which often aren’t replacing anyone but are automating away part of the daily hassles.
What would qualify as a breakthrough for you?
What is your bar even? automated coding has changed the game already.
Strictly speaking, it's modifying itself. Although it would be an interesting challenge - can an LLM create a new LLM from scratch?
No, it probably can't during our lifetime at least—but it can sure modify itself to avoid antivirus detection, which is _just swell_.
why do you think so? they provide some evidence of this in the article, but there have been several improvements in e.g. nanogpt-speedrun or openai parameter golf made by AIs
Which is funny because people have been using LISP for that since 1960.
Which is what makes putting an LLM inside a lisp so much fun
It's pretty crazy that a company like Anthropic no longer needs to hire Software Engineers, because their software engineers itself. If that's not a breakthrough I don't know what is!

edit: it looks like I was wrong and they're still hiring many software engineers. Not completely sure why that is just yet.

The arguments against AI assisted coding used to be "only for toy projects", then at some point it became "no dignity", "joyless". Now it's "no new breakthrough" apparently. All in the span of maybe a year. I say it's made tremendous progress.
Then where is the big new non toy project created since vibe coding became a thing, that couldn’t have been created without ai?
don’t know if this qualifies as big in your book, but there are some well marketed advances here:

https://deepmind.google/blog/alphaevolve-impact/

Openclaw
I make one (small) almost every day. Admittedly the reason that couldn’t be done is because it would take time that I don’t have but 1000% every day something is written by AI that I use that would not exist if AI didn’t exist.

I don’t publish them - but they’re put into use in production and they provide a tangible benefit that would not exist otherwise.

I do this too.

I especially love how making a nicely styled website these days is a matter of describing what it looks like and waiting 10-15 minutes. There are other examples

But the OP is claiming 10x productivity improvements along some metrics. If that was even slightly true under even a generous interpretation of what it might mean, I’d expect an actual breakthrough, not the ability to churn out little things

What does a breakthrough look like?
Some examples:

- The first web browser

- the first web browser with images

- typescript

- react

- rust

- Fil-C

- doom

- quake

- the Animorphic VM, and its follow-ups like HotSpot, and even competitors/copycats like J9, V8, JSC, etc

- Fortnite battle royale

- Roblox

- thefacebook

- ChatGPT

- Claude code

I know that’s quite a range and that’s intentional.

Anyway, I think we’ll know it when we see it.

Reading through that list, none of those were breakthroughs when they first came out. It took time, in some cases a long time, for them to become good.
- Completing the full CL implementation of Emacs or better still finish Lem.

- Complete GuileMacs, the Guile implementation of Emacs. As AI is supposedly much more capable than Humans, it would be great if the above mentioned implementations are even more efficient and feature rich than Emacs!

- Something like Android (maybe even a clone?) with the Java Layer removed and replaced with CL and with Linux kernel still intact. Basically CL over Linux as opposed to the Java over Linux in Android.

- For fun, an implementation of the Lisp machines' OS with Lisp all the way down though Assembly is allowed for critical pieces. It should be a full blown modern Desktop with equivalents of what users expect from a modern OS ...

The LLM+Harness mostly helps with execution.

These are new products (generally) and that's a different class of problem.

It is possible that since LLM+harness helps with execution then we should see more experiments.

Even then we should be able to see things that previously were not possible because they took too much effort.

For example NPCs in games that have complexity that previously was not possible.

Good games often push the boundaries a bit, so should be a good example.

Of course now we can start arguing that there isn't a lot of investment into gaming currently, because it all goes into AI. Too bad.

we're still at least 3 years too early for that. games usually are in a 5+ year dev cycle, so even if AI made gamedev 2x faster, we're still not at the point where the first opus 4.5 games are out
Massive productivity gains.
Yeah.

To play devil’s advocate, computers didn’t translate to massive productivity gains until long after businesses adopted them. There was that quote from ’87: "you can see the computer age everywhere but in the productivity statistics"

Maybe we’re seeing something like that right now with AI?

Who knows man

This is absolutely the right vision imo.

Personally, I'm seeing massive improvements to my workflow and the quality of the product I'm shipping. I'm using AI to crank out far more tests than I used to be able to write, and I am using AI to analyze results with far more fidelity and speed than I could ever have done myself. That means I have more quality time.

But this will change, because the meaning of software development will change to expect, nay, to require AI use. I've heard this is already happening at e.g. Google. The expectation of what can be achieved by tinkerers and by professionals will change. The expectation of what it means to interact with software via your own agents will change and will become commonplace. Apple still hasn't figured out the local agent on the iPhone, but they will. 2027 is not going to feel at all like 2025.

But is any of that a fundamental change? It sure feels fundamental to me, but maybe that's because my everyday has totally changed, but the product I am responsible for has not. Yet. The product I am responsible for operates in critical infrastructure where I personally hope AI never has deep roots, but maybe that's just me. I don't think using AI to build a system that is offline from any AI is the same as depending on an AI to make realtime decisions for critical infrastructure.

"That means I have more quality time."

For now... the shareholders demand managers get the max out of every employee. Throw the force of competition etc into the mix and yeah labour isn't going to benefit all that much.

You are absolutely right. It will be a small window in the development business. Enjoy it if you can!
Perhaps it’s a generational change? People who grew up with computers went on to be more productive with them, something like that might happen with AI too.
Efficiency and productivity in relation to final goods measured in GDP aren't the same thing.

It's yet to be determined just how 'efficient' people are with LLMs, as it's not really a one-person thing - the true measure is based on an entire collection of people's output.

Startups being rapidly efficient doesn't mean much in relation to the overall economy.

Great comment. I think the answer is Jevons Paradox, as usual

https://en.wikipedia.org/wiki/Jevons_paradox

How about a Windows file browser that opens in less than 5 seconds.
That sounds like a your-system issue. I hit Win+E (admittedly on an old Win10 box) and it instantly pops up an explorer window.
Try win11
Nah… they killed WMR in it, which I need for the only reason I’m running Windows in the first place: iRacing in VR.
FilePilot has been a thing for a while now
That started pre-llm
> exactly zero software breakthroughs since vibe coding started, other than vibe coding itself

Generative AI is meant to be a mimic - Richard Sutton

https://x.com/RichardSSutton/status/2061216087744946656

The breakthroughs in mass state surveillance are coming, never fear.
What does a software breakthrough look like in your opinion?

If you get yourself to define it, maybe you'll find it achievable :)

Solved a bunch of Erdos problems.
What would qualify as a breakthrough for you?
OpenAI has how many employees and the ChatGPT app has 1 billion MAU
Vibe coding is the breakthrough. There's always been "no-code" solutions to problems in various business domains, but they were invariably janky, underpowered, and/or overpriced. Now we have a way for domain experts to go directly from ACTUAL natural language directly to implementation in a real programming language, fully automated, in minutes or hours. How is that not a science-fiction level breakthrough? In 2011 if anyone had said that would be possible "in 15 years", I think most professionals at the time would not have replied with "yeah it's coming but your timeline is off". It would have been "you have no fucking idea what you're talking about".