by funnyfoobar 173 days ago
The process you have described for Codex is scary to me personally.

it takes only one extra line of code in my world (finance) to have catastrophic consequences.

even though i am using tools like claude/cursor, i make sure to review every small bit they generate. i ask the tool to create a plan with steps and then perform each step, asking me for feedback as it goes. only when i give approval or feedback does it either proceed to the next step or iterate on the previous one. on top of that, i manually test everything i send for PR.

because there is no value in just sending a PR vs sending a verified/tested PR

with that said, I am not sure how much of your code is getting checked in without supervision, as it's very difficult for people to review weeks worth of work at a time.

just my 2 cents


Heya, I’m the author of the post! To be clear I have AI write probably 95% of my code these days, but I review every line of code that AI writes to make sure it meets my high standards. The same rules I’ve always had still apply — to quote @simonw “your job is to deliver code you have proven to work”.

So while I’m enthusiastic about AI writing my code in the literal sense, it’s still my code to understand and maintain. If I can’t do that then I work with AI to understand what was written, and if I still can’t then I’ll often give it another go with another approach altogether so I can generate something I can understand. (Most of the time working together to understand the code works better, because I love to learn and am always open to pushing my boundaries to grow, and this process can be tuned well to self-directed learning.)

And to quote a recent audit: “this is probably one of the cleanest codebases I’ve ever audited.” I say that to emphasize the fact that I care a lot about the code that goes into my codebase, and I’m not interested in building layers of unchecked AI slop for code that goes into my apps.

Thanks for the clarification.

personally it would be too difficult for me to understand large chunks of work, like in your case "a week's worth of code" at a time. just wondering how do you go about it?

second, how do you pass such large PRs to your co-workers? (if you have any)

So I will state upfront that my current experience is not the most common team dynamic because I'm an indie developer [^1]. But I've worked at many companies — as small as 2 and as large as Twitter — so I am very familiar with the variety of engineering processes.

I can share how I work with agentic systems, because I (and now others) have found it to be very effective. I still have the engineering-like experience of thinking deeply — I've gotten great results across codebases small and large — and almost everyone who I've run a workshop with has come back to me and said that this was a missing piece for them when they work with agentic systems.

I'm the kind of person I alluded to at the end of my blog post when I wrote "Some people couldn’t start coding until they had a checklist of everything they needed to do to solve a problem.", so this description will be representative of that.

1. I start a document in Craft [^2] whenever I think of a great feature, and keep adding to that doc over the next few months whenever I have a new idea. I try to turn that document into something cohesive — imagine something like a PRD without the formality.

2. Then when it comes time to build the feature, I will just sit and write out a prompt (with lots of pointers to source code and relevant screenshots) that considers everything that needs to be built. I'll write out our goals for the feature, how the client should work, how the server should behave, the expected user experience, and anything else that's relevant. That process is really clarifying because it unearths a whole bunch of meaningful context — and context is exactly what a large language model needs!

3. Last but not least I'll simply add something like "Please ask any clarifying questions you may have, or for any additional details that you may find helpful". That leads to questions which I spend anywhere from another 5 to 30 minutes on, which fills in the gaps that I hadn't even considered to consider. And sure that may take time, but now the model has *so many useful details* that most people never add to their context window.

4. Once you have that, the model can act much more surgically than the experience most people have with agentic systems. Since it's so surgical I can go do something else like work on my newsletter, my AI workshops, or even go for a walk. This is why I much prefer to work this way, as opposed to the hands-on process I described Claude Code users [often] preferring in the blog post. (Which as I mentioned there is perfectly fine, just not my cup of tea anymore.)
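For readers who like to see structure, here is a rough sketch of how the prompt from steps 2 and 3 could be assembled. To be clear, this is only an illustration and not the tooling described above: the `FeatureBrief` class, its field names, and `build_prompt` are all hypothetical, and the only text taken from the steps is the closing request for clarifying questions.

```python
from dataclasses import dataclass, field

# The closing line from step 3, appended to every prompt.
CLARIFY_REQUEST = (
    "Please ask any clarifying questions you may have, "
    "or for any additional details that you may find helpful."
)


@dataclass
class FeatureBrief:
    """Hypothetical container for the context gathered in steps 1 and 2."""
    goals: str
    client_behavior: str
    server_behavior: str
    user_experience: str
    # Pointers to relevant source files (assumed to be plain paths).
    source_pointers: list[str] = field(default_factory=list)
    # (question, answer) pairs collected after the model asks for details.
    answers: list[tuple[str, str]] = field(default_factory=list)


def build_prompt(brief: FeatureBrief) -> str:
    """Join the non-empty sections into one prompt ending with the clarifying request."""
    sections = [
        ("Goals", brief.goals),
        ("Client behavior", brief.client_behavior),
        ("Server behavior", brief.server_behavior),
        ("Expected user experience", brief.user_experience),
    ]
    parts = [
        f"## {title}\n{body.strip()}"
        for title, body in sections
        if body.strip()
    ]
    if brief.source_pointers:
        pointers = "\n".join(f"- {path}" for path in brief.source_pointers)
        parts.append("## Relevant source files\n" + pointers)
    if brief.answers:
        qa = "\n".join(f"Q: {q}\nA: {a}" for q, a in brief.answers)
        parts.append("## Answers to earlier questions\n" + qa)
    parts.append(CLARIFY_REQUEST)
    return "\n\n".join(parts)
```

The point of a shape like this is that the answers from the question round are fed back into the same prompt, so the context keeps growing before any code is generated.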

---

I'd still like to touch on working with people though. I do quite a bit of open source work and there I still follow what people would consider standard processes and best practices. If I'm doing a week's worth of work I still don't want to dump a whole ton of code in one commit, so I'll break everything down into very atomic commits that spell out exactly what I'm doing. I also write lots of documentation, update references, and add tests like a person should.

But there's also nothing to say you have to generate a week's worth of code in one go. It's important to remember that you're in control of how you work. It may be more fitting to define smaller tasks (which will take less time for each independent step) and work on them serially, which you can then hand off to your coworkers one by one.

Ultimately my message is that people still need to exercise their best judgment and think for themselves. AI doesn't change what we've come to accept as best practices, it automates and accelerates them. In fact, the models keep getting better the more they are trained on our best practices, so my assertion is that success using AI seems to correlate well with autonomy, creativity, and critical thinking skills.

Anyhow, long answer for a short question — but I hope it helps! And if there's anything unclear: please ask any clarifying questions you may have, or for any additional details that you may find helpful.

[^1]: https://plinky.app

[^2]: https://craft.do