Hacker News
by r_lee 3 hours ago
if we are at 10x with AI and near AGI or ASI, then how is it possible that these products (Codex, Claude Code CLI) are still such garbage?

shouldn't this "agentic AI revolution" have long solved this already?

no way they're over there saying "we are on it plz wait" or that "it's too much effort"?

8 comments

This is the biggest elephant in the room I have seen in my decade+ career. At the same time, look how bad Apple is in software compared to its hardware... It's not an AI-only problem; it's almost like software in general gets a free pass on being very unsafe or low quality because no one wants to face the same "profit reducing red tape" that civil engineers or similar face.
Anthropic were the progenitors of the Model Context Protocol. Claude Code does not fully implement the client end of the protocol. A protocol; a literal pre-defined spec that an agent should be able to one-shot. Neither does Codex. Codex does not implement MCP Prompts.

(I want Codex to implement MCP Prompts because then we have one central way to ship skills from a server).

The fact that neither platform can implement a protocol given what is functionally infinite frontier model tokens really says a lot. I do not care what kind of random project some influencer can ship with a swarm of 1000 agents. If you cannot make the basics work, it is a farce.

It still boggles my mind that Anthropic would invent MCP but not fully implement it.

Especially when fully implementing it (prompts, resources, tools) is easily done in harnesses that don’t ship with MCP but allow good extension / modification like Pi.

Claude not being able to see its own usage or self invoke slash commands is also very frustrating.

Like anything, you have to decide between polish vs switch to any other task in the queue. If you pick the latter too often, then polish suffers, but that's a human thing.

Also, Codex and Claude Code aren't as bad as people say. I think most of the noise is embellished by the "hah see? AI sucks" angle.

It's kind of like how HNers would claim to your face that you can't actually build anything with JavaScript and Node.js (JS just sucks too much), then they'd list off a few footguns that were supposed to demonstrate why. In other words, champing at the bit for JS to fail led people to catastrophize issues that were pretty mediocre.

>Like anything, you have to decide between polish vs switch to any other task in the queue

Why do you "have to decide"? Let some agents go at both of those, isn't that what they claim people can just do?

>Also, Codex and Claude Code aren't as bad as people say. I think most of the noise is embellished by the "hah see? AI sucks" angle.

Why shouldn't it? They're not the ones making the extraordinary claims.

The "AI revolution" feels like it's creating a bunch of ultra-smart AI models that are scarily good at cracking most of human-created security (Mythos), but also happen to be careless snobs that just leave litter and mess in their wake.
Because vibe coding is a toy… that's the secret.

You can use it to accelerate development certainly, but that requires careful change->review cycles. The developer still needs to be in heavy control, versus vibe coding having an agent own the code base.

If the code churn is high, the investment in refactoring etc. is less beneficial than may be obvious. I don't remember the details, but I heard in some podcast that the code base of Claude Code changes so fast that any piece of code won't be there for long.
In other words it's an ever moving vibe fest, with random bugs and misbehaviors each time they roll the dice...
Yes, it’s very characteristic of the gen-AI era.
If they respected their users they’d at least pin some versions that are more stable.
You are asking too many good questions.
The products generally work just fine on my MacBook.

I have not encountered major issues in either the Claude Code CLI, the Codex Desktop app, or Claude Desktop app.

They generally get the job done. I don't measure disk writes or analyze the GPU usage.

A simple explanation is that they are "good enough" for most people and they have better things to do. Even if tomorrow I was 100 times as productive, I still wouldn't have time to do literally everything and I would have to prioritize.
You might not.

But the Claude Code team has ONE job.

And they have full access to a platform that they advertise as "humanity-threat" level good, and claim that it can automate everything code related...

I think they have more than one job: they have to balance new features with improving the software itself. And Anthropic has to balance investing resources into Claude Code vs infra or other things.

Not that I'm happy with the current state of things; in fact I'm quite sad that improvements in capacity to do things don't translate into better quality.