Hacker News new | ask | show | jobs
by yawnxyz 671 days ago
Claude gave me something similar, except these were both used, and somehow global variables, and it got confused about when to use which one.

Asking it to refactor / fix it made it worse bc it'd get confused, and merge them into a single variable — the problem was they had slightly different uses, which broke everything

I had to step through the code line by line to fix it.

Using Claude's still faster for me, as it'd probably take a week for me to write the code in the first place.

BUT there's a lot of traps like this hidden everywhere probably, and those will rear their ugly heads at some point. Wish there was a good test generation tool to go with the code generation tool...

2 comments

One thing I've found in doing a lot of coding with LLMs is that you're often better off updating the initial prompt and starting fresh rather than asking for fixes.

Having mistakes in context seems to 'contaminate' the results and you keep getting more problems even when you're specifically asking for a fix.

It does make some sense as LLMs are generally known to respond much better to positive examples than negative examples. If an LLM sees the wrong way, it can't help being influenced by it, even if your prompt says very sternly not to do it that way. So you're usually better off re-framing what you want in positive terms.

I actually built an AI coding tool to help enable the workflow of backing up and re-prompting: https://github.com/plandex-ai/plandex

As someone who uses LLMs on my hobby projects to write code, I’ve found the opposite. I usually fix the code, then send it in saying it is a refactor to clarify things. It seems to work well enough. If it is rather complex, I will paste the broken code into another conversation and ask it to refactor/explain what is going on.
Fixing the mistake yourself and then sending the code back is a positive example, since you're demonstrating the correct way rather than asking for a fix.

But in my experience, if you continue iterating from that point, there's still a risk that parts of the original broken code can leak back into the output again later on since the broken code is still in context.

Ymmv of course and it definitely depends a lot on the complexity of what you're doing.

I’m attempting to keep the context ball rolling by reiterating key points of a request throughout the conversation.

The challenge is writing in a tone that will gently move the conversation rather than refocus it. I can’t just inject “remember point n+1” and hope that’s not all it’ll talk about in the next frame.

If nothing else, LLMs have helped me understand exactly why GIGO is a fundamental law.

I'd refer you to a comment I made a few weeks ago on an HN post, to the same effect, which drew the further comment from gwern here:

https://news.ycombinator.com/item?id=40922090

LSS: metaprogramming tests is not trivial but straightforward, given that you can see the code, the AST, and associated metadata, such as generating test input. I've done it myself, more than a decade ago.

I've referred to this as a mix of literate programming (noting the traps you referred to and the anachronistic quality of them relative to both the generated tests and their generated tested code) wrapped up in human-computer sensemaking given the fact that what the AI sees is often at best a lack in its symbolic representation that is imaginary, not real; thus, requiring iterative correction to hit its user's target, just like a real test team interacting with a dev team.

In my estimation, it's actually harder to explain than it is to do.