| HN Mirror


by TACIXAT 22 days ago

This article doesn't address writing code with AI, just code review. My issue with agentic coding is that I make numerous micro-architectural decisions while programming. I almost never have a full spec up front and develop one as I consider what I am writing.

When using Claude Code or Codex, that is all gone. Claude Code is extremely eager to reach the end goal to the point that it feels like a fever dream to write code with it. In the end, I have low confidence about edge cases and fit into the project's architectural and design goals.

On top of that, I enjoy programming, reverse engineering, etc. and I feel that the LLMs, while able to solve some problems or deliver some features, take that fun away. I'm trying really hard to find a workflow with them that I'm confident in, but I fear that workflow is just chat, search, and being a rubber duck for my thoughts.

7 comments

HyperL0gi 22 days ago

> This article doesn't address writing code with AI, just code review. My issue with agentic coding is that I make numerous micro-architectural decisions while programming. I almost never have a full spec up front and develop one as I consider what I am writing.

Working with AI forced me to write better specs, but the way I write them today is very different. I typically open Codex with the Linear MCP connected, and my chat with the AI ends up writing the issue. It's a lot of back-and-forth where I say what I want, the AI does all the code scanning and writes something, I correct something, etc.

The value for me is exactly that I say what I want and the AI verifies in the actual code whether that's the path that makes the most sense. In the end I have a pretty detailed spec that I'm much more confident is the correct path.

I find the spec easier to review than a huge PR, so execution is typically much faster and better aligned with what I want.

The grill-me skill from Matt Pocock is great for this (https://github.com/mattpocock/skills/blob/main/skills/produc...)


zahlman 22 days ago

> but I fear that workflow is just chat, search, and being a rubber duck for my thoughts.

That's still a lot of benefit, though. I have to agree with Patrick McKenzie on this one (https://x.com/patio11/status/2058631943785488815):

> If the only impact of LLMs professionally was causing people to "think out loud" in a way which was routinely captured by computer systems and then could be operated on by computer systems, that would by itself be one of the most consequential changes in practice in 100 years


aakresearch 22 days ago

> I fear that workflow is just chat, search, and being a rubber duck for my thoughts

This is exactly what I settled on after trying really hard myself. It is liberating, and I have no fear at all!


teaearlgraycold 22 days ago

A lot of programming work is well represented in the training data. For that kind of stuff there’s not much to do regarding architectural decisions. I love to run the LLMs on auto for that work. But for anything not well represented in the training data, which could be anything from mundane stuff in PyQt to a truly novel application, keep them on a short leash or forget them altogether.


redeye100 22 days ago

> represented in the training data

This isn’t a binary is/isn’t thing, though. What if only 80% of my task is represented? How would I know that the other part isn’t if I haven’t worked it through fully?

What if my task is generally represented, but for my specific context, there are specific details that aren’t?

How would I know until I’ve reasoned through it myself? At that point, having the LLM do the work doesn’t add much value.


photochemsyn 22 days ago

I find it really interesting and helpful to use the LLM to generate different git repo skeletons for the same class of project in the 4-5 programming languages I’m familiar with. Then I ask it to explicitly describe its design decisions for different parts of the small codebase, e.g. what the internal APIs look like, so that if you make changes in one section of the codebase, you can be sure you don’t accidentally create problems in another section. Only once you’ve worked out all such constraints, clarified dependencies, etc. do you start generating code in each subsection, using the specific constraints for that section in each prompt and reviewing all the code. This is also when you generate the tests for each subsection. Finally, this is where using one or more different LLMs for code review after the code is written becomes important. It’s a slow process, certainly, but it seems to work pretty well.


ggregoire 22 days ago

> On top of that, I enjoy programming, reverse engineering, etc. and I feel that the LLMs, while able to solve some problems or deliver some features, take that fun away.

Same. I prefer asking Gemini one or more very technical questions, then analyzing, comparing and understanding the responses, and implementing it myself based on what I learned (or just integrating the answer into the codebase as is, if I asked it to write a function), rather than delegating away all the fun to an agent.


RevEng 21 days ago

This is why I don't use agent-first platforms like Claude Code. I want to write software with an AI to assist me, not an AI to write my software for me. I don't want an environment whose main mode of operation is instructing an AI to write code - I want a typical IDE where I can continue writing code myself but with an assistant there to consult whenever I want it.

Even then, it's easy to fall into a trap of giving the AI a simple description and letting it fill in the blanks, but I've learned the discipline not to do that, in the same way I learned to think before I speak and design before I write code.

Planning mode is my entry point for almost all code I would have the AI write. I already have in mind what I think I want. I get it to create a detailed plan, which inevitably fills in things I didn't specify and even asks questions I hadn't considered. I iterate on this first-revision spec until I think it's ready. This results in a task list. But just like waterfall doesn't mean make a plan and execute it all without looking back, executing this plan is also a stepwise iterative process. I let the AI execute the first step. I check its work. I run some tests and see if it behaves like I thought it would - the same stuff I would do during normal development. If I find issues, I go back to the plan and change it, then continue implementing the revised plan. If the previous step led me into a dead end, I revert that one step and try again with my revised plan.
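(Editor's illustration, not the commenter's actual tooling: the stepwise loop described above could be sketched roughly as follows. `Step`, `run_plan`, `revise`, and the list-based `Workspace` are all hypothetical names invented for this sketch; a real setup would more likely use git commits as checkpoints and a human review plus a test suite as the check.)

```python
from dataclasses import dataclass
from typing import Callable, List

Workspace = List[str]  # hypothetical stand-in for the real codebase


@dataclass
class Step:
    name: str
    apply: Callable[[Workspace], None]  # the AI (or you) makes the change
    check: Callable[[Workspace], bool]  # your review and tests for this step


def run_plan(steps, workspace, revise, max_attempts=3):
    """Run a plan one step at a time; revert and revise a step that fails its check."""
    log = []
    for step in steps:
        for attempt in range(1, max_attempts + 1):
            snapshot = list(workspace)   # checkpoint before touching anything
            step.apply(workspace)
            if step.check(workspace):
                log.append(f"{step.name}: ok (attempt {attempt})")
                break
            workspace[:] = snapshot      # revert only this one step
            log.append(f"{step.name}: reverted (attempt {attempt})")
            step = revise(step)          # go back to the plan and change it
        else:
            raise RuntimeError(f"{step.name}: still failing after {max_attempts} attempts")
    return log


# Toy usage: the second step is wrong on the first try and gets revised.
plan = [
    Step("add parser", lambda ws: ws.append("parser"), lambda ws: "parser" in ws),
    Step("add cache", lambda ws: ws.append("cache-buggy"), lambda ws: "cache" in ws),
]


def revise(step):
    # Hypothetical plan revision: keep the same check, change the implementation.
    return Step(step.name, lambda ws: ws.append("cache"), step.check)


workspace: Workspace = []
log = run_plan(plan, workspace, revise)
```

The point of the sketch is that the check and the decision to revise sit outside the step itself, so whoever writes the change never decides on its own whether it stays.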

The key thing is this: this was my development methodology before AI entered the picture. Nothing has fundamentally changed. What has changed is that the AI provides input at one or more stages of the flow - offering alternatives, asking questions, running tests, researching and debugging - but at every single step the AI does not decide on the final outcome. Even if the AI wrote all of the code, I still review it and test it. Even if it suggested a design, I compare the options, review the referenced documents, and decide for myself. Even if it reviews my code and says it would do a hundred things differently, I decide which suggestions I will act on and how. It's no different from what I would do with a coworker giving me ideas, reviewing my work, or making their own attempt at a solution. It's all helpful input, and if I'm happy with it I will use it as is, but I'm still responsible for every line of code and I still make all of the decisions about what stays and what goes.

I'm sure this sounds wasteful - why use an AI if you have to review it and correct it anyway? For the same reason I delegate tickets to a team of 20 junior developers and don't do them all myself as a principal engineer - I am but one person and they are an army. Even if I have to discuss plans and options with them and review their work, they can spend the same hours I would brainstorming and researching and prototyping and debugging, and I can go over the results with them to make key decisions and make sure we are on the right track. I make the important decisions, but the leg work of getting everything in place is done by someone else who doesn't need as much knowledge or insight or experience. It is a force multiplier. It is the equivalent of a lawyer with a team of paralegals or a professor with a team of researchers and grad students. I can let the AI do the things it does well so that I don't have to, and instead I can spend my time on the things I do well that it can't do.

This is where I think we are going wrong with AI in many areas, but particularly in software development. An AI is not a replacement for an engineer - it's an engineer's assistant. It isn't a source of truth - it's a source of ideas. It's not responsible for the outcome - I am. A team of helpers can make an expert more productive, but that team isn't a substitute for the expert. Likewise, juniors still need to gain the same experiences and learn the same practices and make the same mistakes, because they still need to become experts, but they can use the AI as a guide along that journey rather than having to rely solely on another expert for mentorship.
