by stillpointlab 294 days ago
I'm still calibrating myself on the size of task that I can get Claude Code to do before I have to intervene.

I call this problem the "goldilocks" problem. The task has to be large enough that it outweighs the time necessary to write out a sufficiently detailed specification AND to review and fix the output. It has to be small enough that Claude doesn't get overwhelmed.

The issue is that writing a "sufficiently detailed specification" is task-dependent. Sometimes a single sentence is enough, other times a paragraph or two, and sometimes a couple of pages are necessary. The "review and fix" phase is again totally task-dependent and unknown up front. I can usually estimate the spec time, but the review and fix phase is a dice roll dependent on the output of the agent.
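
To make the trade-off above concrete, here is a rough sketch of the break-even check: delegating only pays off when doing the task by hand would take longer than writing the spec plus the expected review-and-fix time. The function names and all the numbers are made up for illustration, and the "dice roll" is modelled (as an assumption) as a handful of guessed outcomes with probabilities.

```python
def expected_minutes(outcomes):
    """Probability-weighted average over (probability, minutes) pairs."""
    return sum(p * minutes for p, minutes in outcomes)


def worth_delegating(manual_minutes, spec_minutes, review_outcomes):
    """True when doing it by hand costs more than spec + expected review/fix."""
    delegated = spec_minutes + expected_minutes(review_outcomes)
    return manual_minutes > delegated


# Hypothetical guesses: usually a quick review, sometimes a long cleanup.
review_guess = [(0.6, 10), (0.3, 40), (0.1, 120)]

print(worth_delegating(manual_minutes=90, spec_minutes=15, review_outcomes=review_guess))
print(worth_delegating(manual_minutes=20, spec_minutes=15, review_outcomes=review_guess))
```

The first call models a task that is big enough to be worth handing off; the second models one that is too small to justify the spec and review overhead. The hard part, as noted, is that the review outcomes are only guesses.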

And the "overwhelming" metric is again not clear. Sometimes Claude Code can crush significant tasks in one shot. Other times it can get stuck or lost. I haven't fully developed an intuition for this yet, how to differentiate these.

What I can say, this is an entirely new skill. It isn't like architecting large systems for human development. It isn't like programming. It is its own thing.

6 comments

This is why I'm still dubious about the overall productivity increase we'll see from AI once all the dust settles.

I think it's undeniable that in narrow, well-controlled use cases the AI does give you a bump. Once you move beyond that, though, the time you have to spend on cleanup starts to seriously eat into any efficiency gains.

And if you're in a domain you know very little about, I think any use case beyond helping you learn a little quicker is a net negative.

"It isn't like programming. It is its own thing."

You articulated what I was wrestling with in the post perfectly.

> It isn't like programming. It is its own thing.

Absolutely. And what I find fascinating is that this experience is highly personal. I have read probably 876 different “How I code with LLMs” posts, and I can honestly say not a single thing I read and tried (and I tried A LOT) “worked” for me…

According to most enthusiasts of LLM/agentic coding you are just doing it wrong then.

Not sure this is really true/fair. I think what LLM/agentic coding enthusiasts will say is that they have found their way to be effective with it, while naysayers will fight the "this is sh*t" battle until they are eventually out of the workforce.

So why do you only opt for that side of the argument? Why not entertain the other side: that the naysayers will be able to keep their jobs after the bubble bursts because they still know how to code by hand? That exact sentiment is what I was alluding to.

There may be some truth to LLM vibe coding, and there may be some truth to the “old guard” saying “this is shit”, because it might be shit for very good reasons.

I’ve been doing this sh*t for 30 years, and one thing I can tell you I learned: when you see something as “groundbreaking” (controversial?) as LLMs and see many people telling you how much more productive they are with it, there are almost always two camps:

- those fighting HARD to tell you at the top of their lungs “oh this is sh*t, I tried it and it is baaaad”

- those going “hmmm let me see how I can learn etc to get to the point where I am also a lot more productive, if ____ and ____ can learn it so can I…”

You always want to be in the second camp…

I seem to have forgotten the golden rule to never speak out against LLMs, lest you be subjected to instant downvotes. I don't mind the downvotes, but bring some counterpoints to the discussion and make it worthy of the platform.

EDIT: typo

One of the likely reasons you are getting downvoted is that you made a snarky remark. An (unsolicited) word of advice: you should always listen to the enthusiasts (if you are not one of them). They have figured something out before you did (nothing wrong with that, many people are much smarter than you and I)...

> I haven't fully developed an intuition for this yet, how to differentiate these.

The big issue is that, even though there is a logical side to it, part of it is adapting to a closed system that can change under your feet. New model, new prompt, there goes your practice.

For the longer ones, are you using AI to help you write the specs?

My experience is: AI-written prompts are overly long and overly specific. I prefer to write the instructions myself and then direct the LLM to ask clarifying questions or provide an implementation plan. Depending on the size of the change, I go through 1-3 rounds of clarifications until Claude indicates it is ready and provides a plan that I can review.

I do this in a task_descrtiption.md file and I include the clarifications in its own section (the files follow a task.template.md format).
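
For anyone wanting to try something similar, here is a minimal sketch of what such a task file could look like when generated from a template. The section names (`Task`, `Clarifications`, `Plan`) and the `render_task` helper are my own assumptions for illustration; the actual task.template.md format is not shown in this thread.

```python
# Hypothetical layout: the real template's sections are not known.
TEMPLATE = """# Task
{task}

## Clarifications
{clarifications}

## Plan
{plan}
"""


def render_task(task, clarifications, plan="(to be written by the agent once it is ready)"):
    """Fill the template with hand-written instructions and any Q&A rounds so far."""
    qa_lines = "\n".join(f"- Q: {q}\n  A: {a}" for q, a in clarifications)
    return TEMPLATE.format(
        task=task,
        clarifications=qa_lines or "(none yet)",
        plan=plan,
    )


# Example with made-up content: one round of clarification recorded.
print(render_task(
    task="Add retry logic to the upload client.",
    clarifications=[("Maximum number of attempts?", "3, with exponential backoff.")],
))
```

Keeping the clarifications in their own section means each round of questions is appended to the same file, so the plan that comes back can be reviewed against everything that was agreed.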

> What I can say, this is an entirely new skill. It isn't like architecting large systems for human development. It isn't like programming. It is its own thing.

It's management!

I find myself asking very similar questions to you: how much detail is too much? How likely is this to succeed without my assistance? If it does succeed, will I need to refactor? Am I wasting my time delegating or should I just do it?

It's almost identical to when I delegate a task to a junior... only the feedback cycle of "did I guess correctly here" is a lot faster... and unlike a junior, the AI will never get better from the experience.