

by LoganDark 150 days ago

Until something happens that disproves this, my personal belief is that supervision and manual review are among the best ways to use a coding agent. You don't need to understand everything it does, but you will benefit from a technical background and from at least surface-level knowledge of, or intuition about, what it spits out.

I review every diff Claude Code applies and periodically re-review entire projects as a whole. Through this, I've managed to keep architectures fairly principled, with future expansions in mind. I recently managed to essentially two-shot an MLX implementation of a working forward pass for a diffusion language model, based on CUDA source code that is not compatible with my machine. There's more work needed before it's anywhere near usable in practice, but the fact that the model is now running at all on my machine is a very impressive start.

For that, I had it study the CUDA source code and write a very detailed document with its analysis of exactly how the model is implemented. This document had only one material flaw. Then it studied MLX for a while and spat out a running forward pass based on the flawed document. The output wasn't of sufficient quality, so I had it insert debug prints throughout the whole inference process to see where it was going wrong. That let it find and fix both the forward pass and the flaw in the document. I needed no domain experience in LLMs or DLMs for this (although I benefit from some minor past contributions to RWKV.cpp).
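To illustrate the debug-print approach described above, here is a minimal sketch in plain Python rather than MLX. The names `summarize` and `traced_forward`, and the toy stages in the demo, are hypothetical and not taken from the actual project; the assumption is only that the forward pass can be expressed as an ordered list of named stages.

```python
import math


def summarize(name, values):
    """Return a one-line summary of a flat list of floats (hypothetical helper)."""
    finite = [v for v in values if math.isfinite(v)]
    non_finite = len(values) - len(finite)
    mean = sum(finite) / len(finite) if finite else float("nan")
    max_abs = max((abs(v) for v in finite), default=float("nan"))
    return (
        f"{name}: n={len(values)} mean={mean:.4f} "
        f"max_abs={max_abs:.4f} non_finite={non_finite}"
    )


def traced_forward(stages, x, log=print):
    """Run each (name, fn) stage in order, logging a summary after every stage."""
    for name, fn in stages:
        x = fn(x)
        log(summarize(name, x))
    return x


if __name__ == "__main__":
    # Toy stand-ins for real model stages (assumptions, not the real model).
    stages = [
        ("embed", lambda x: [v + 1.0 for v in x]),
        # Deliberate bug: multiplies where it should divide.
        ("scale", lambda x: [v * 1e6 for v in x]),
    ]
    traced_forward(stages, [0.0, 1.0])
```

Reading the log top to bottom, the first stage whose statistics look implausible (a sudden jump in magnitude, or any non-finite values) is the place to start looking.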

Another example is that I recently started getting into SwiftUI, and Claude Code is doing a very good job of demonstrating code patterns and pointing me towards APIs that may solve my problems. It also helped me set up things like API clients (which in turn, of course, gave me pointers to all sorts of documentation I'd benefit from reading in full). I reject a very large fraction of its suggestions, I tweak its plans very frequently, and I tell it off a lot for things that are either unidiomatic or objectively terrible hacks. But it is incredibly useful for menial work, for enumerating possibilities, for quickly scaffolding placeholder content, and for demonstrating patterns I haven't learned yet as they apply to my specific situation. For example, Claude Code quickly taught me how to use NSViewRepresentable, whereas in a past project where I didn't use LLMs, I absolutely struggled to embed a Metal view.

But all that is to say that I'm skeptical of solutions that try to have you describe your idea in plain language, that try to insulate you from the code, or that tell you the lie that you just don't have to worry about it. If you work at all on the kinds of projects I work on, which are chock-full of reverse engineering and an obsessive focus on tightly controlled design and idiomatic code, I firmly believe that treating the code like a black box is not the way to do it. I don't know if Glaze truly hides the code, but I don't see any mention of it in the trailer video, and that makes me feel a little dismissive.