Hacker News
by khazhoux 72 days ago
Me: has to babysit every feature for hours in Claude Code, building a good plan but then still iterating many many times over things that need to be fixed and tweaked until the feature can be called done.

Bloggers: Here's how we use 3,000 parallel agents to write, test, and ship a new feature to production every 17 minutes in an 8M-LOC codebase (all agent-generated!).

... I'm doing something wrong, or other people are doing something wrong?

> 8M-LOC codebase

I think this is the difference. These toy examples of using parallel agents are *not* running against large codebases, allowing them to iterate more effectively. Once you are in real codebases (>1M LoC), these systems break down.

(author here) I strongly agree that these systems start to break down once the code base gets larger (we've seen that with our own projects).

But our reaction to it has been to say "ok, well the best practice in software engineering is to make small, well-isolated components anyway, so what if we did that?"

We've been trying to really break things apart into smaller pieces (and that's even evident in mngr, where much of the code is split out into separate plugins), and have been having a ton of success with it.

I realize that that might not be an option for more brownfield / existing / legacy projects, but when making something new, I've really been enjoying this way of building things.

To an extent, you are likely doing something wrong.

I understand that the natural instinct is to correct the output when you see your agent doing something wrong.

That is not productive.

The instinct should be to tweak the agent to do it right.

At this point I am almost not writing any code in an enterprise code base.

> The instinct should be to tweak the agent to do it right.

I'm extremely doubtful of this. It doesn't save time to tell it "you have an error on line 19", because that's (often) just as much work as fixing the error. Likewise, saying "be careful and don't make mistakes" is not going to achieve anything. So how can you possibly tweak the agent to "do it right" reliably without human intervention? That's not even a solved problem for working with _humans_ who don't have the context window limitations, let alone an LLM that deletes everything past 30k tokens.

Are you seriously interested in the answer, or are you just mad?

I could give you some pointers, but will only type it out if there is a point

Not GP, but I would love pointers on precisely this problem.

It is about tweaking inline documentation to make sure that:

1. It is not ambiguous.
2. It is as complete as possible.
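
As a rough illustration of those two points, here is a minimal sketch in Python. The function `add_business_days` and its rules are hypothetical, made up for this example rather than taken from any real code base. The idea is that the docstring pins down every choice an agent would otherwise have to guess: what counts as a business day, whether the start date is counted, and what happens at the edges.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Return the date `days` business days after `start`.

    Not ambiguous: a business day is Monday through Friday. Public
    holidays are NOT taken into account. `start` itself is never counted.

    Complete: `days` must be >= 0, otherwise ValueError is raised.
    `days == 0` returns `start` unchanged, even if it falls on a weekend.
    """
    if days < 0:
        raise ValueError("days must be >= 0")
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # weekday() is 0-4 for Monday-Friday
            remaining -= 1
    return current
```

A vaguer docstring such as "adds business days to a date" would leave all of those decisions open, and that is roughly where an agent tends to make a bad assumption.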

I am surprised that I got downvoted for proposing to improve a code base so that agents can run on it, as a means of increasing productivity.

I'm not touching code. I'm trying out the feature, and there's any number of things to tweak (because I missed some detail during planning, or the agent made a bad assumption, etc).

> The instinct should be to tweak the agent to do it right.

Ah, yes; must always remember to add "And don't make any mistakes" into the prompt /s

I am not entirely sure what you are referring to.

Improving the agent means improving the code base such that the agent can effectively work on it.

It cannot come as a surprise that an agent is better at working on a well-documented code base with a clear architecture.

On the other hand, if you expect that an agent can add the right amount of ketchup to your undocumented spaghetti code, then you will continue to have a bad time.