| > the idea that technology forces people to be careless I don't think anyone's saying that about technology in general. Many safety-oriented technologies force people to be more careful, not less. The argument is that this technology leads people to be careless. Personally, my concerns don't have much to do with "the part of coding I enjoy." I enjoy architecture more than rote typing, and if I had a direct way to impose my intent upon code, I'd use it. The trouble is that chatbot interfaces are an indirect and imperfect vector for intent, and when I've used them for high-level code construction, I find my line-by-line understanding of the code quickly slips away from the mental model I'm working with, leaving me with unstable foundations. I could slow down and review it line-by-line, picking all the nits, but that moves against the grain of the tool. The giddy "10x" feeling of AI-assisted coding encourages slippage between granular implementation and high-level understanding. In fact, thinking less about the concrete elements of your implementation is the whole advantage touted by advocates of chatbot coding workflows. But this gap in understanding causes problems down the line. Good automation behaves in extremely consistent and predictable ways, such that we only need to understand the high-level invariants before focusing our attention elsewhere. With good automation, safety and correctness are the path of least resistance. Chatbot codegen draws your attention away without providing those guarantees, demanding best practices that encourage manually checking everything. Safety and correctness are the path of most resistance. |
We can't optimize for an objective truth when that objective truth doesn't exist. So while doing our best to align our models they must simultaneously optimize they ability to deceive us. There's little to no training in that loop where outputs are deeply scrutinized, because we can't scale that type of evaluation. We end up rewarding models that are incorrect in their output.
We don't optimize for correctness, we optimize for the appearance of correctness. We can't confuse the two.
The result is: when LLMs make errors, those errors are difficult for humans you detect.
This results in a fundamentally dangerous tool, does it not? Tools that when they error or fail they do so safely and loudly. Instead this one fails silently. That doesn't mean you shouldn't use the tool but that you need to do so with an abundance of caution.
Actually the big problem I have with coding with LLMs is that it increases my cognitive load, not decreases it. Bring over worked results in carelessness. Who among us does not make more mistakes when they are tired or hungry?That's the opposite of lazy, so hopefully answers OP.