HN Mirror


	by alexwwang 22 days ago
	Understandable. You don’t want to lose control of your codebase, and you don’t trust that an LLM is competent to handle it fully.

2 comments

lukan 22 days ago

No. Because they still hallucinate at times. Confuse things. Forget things. Or none of the above, since that is anthropomorphizing, but the result is the same. They can produce incredible working one-shots, so you start to trust them, then you trust them too much and ... feel the result.

alexwwang 22 days ago

Yes. I am fighting the LLM’s disobedience when it works through my pipeline commands. I believe these violations are caused by its hallucinations. So I am still developing a mechanical system to monitor agents’ behavior automatically. I believe these routines and monitors will act as scaffolding that keeps the LLM on the right track at all times.
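
To make the "mechanical monitor" idea concrete, here is a minimal sketch of one possible check: compare the steps an agent actually ran against the order the pipeline requires. The step names and the `first_violation` helper are hypothetical placeholders, not the actual system described above.

```python
from typing import Optional

# Hypothetical pipeline order; these step names are illustrative only.
REQUIRED_STEPS = ["plan", "implement", "test", "review"]


def first_violation(
    observed: list[str], required: list[str] = REQUIRED_STEPS
) -> Optional[str]:
    """Return a description of the first unknown or skipped step, or None."""
    next_index = 0  # index of the earliest required step not yet completed
    for step in observed:
        if step not in required:
            return f"unknown step: {step}"
        position = required.index(step)
        if position > next_index:
            # The agent jumped ahead without running an earlier step.
            return f"skipped {required[next_index]!r} before {step!r}"
        # Re-running an earlier step (e.g. the tests) is allowed.
        next_index = max(next_index, position + 1)
    return None
```

A monitor like this only catches ordering violations; it says nothing about whether each step's output is actually good.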

xenadu02 22 days ago

The percentage of times I prompt Claude with "what about checking if there are any child processes running?" or "Would using a lock here greatly simplify the design?" and turn out to be correct is approaching 100%. That is, it isn't just Claude sycophantically agreeing with me: the code itself becomes smaller, simpler, and more reliable, with fewer bugs.

The agents tend to produce working code, but the larger the scope, the bigger the mess they tend to make. They will happily evolve toward a local maximum but leave world-destroying bugs lurking in the implementation.

The other issue is that Claude regularly ignores explicit instructions in CLAUDE.md or in prompts. It will "helpfully" decide to do whatever it wants, or reinterpret instructions completely differently than it did the last 100 times.

It has nothing to do with losing control or trust. LLMs are not conscious. They have no executive function. They aren't even thinking. They're just models predicting the next word in the script. They are very useful tools but that's all they are: tools.

notgenerated 21 days ago

I also feel like we still need to steer Claude. It doesn't always help to have stuff in CLAUDE.md (even when it's lean). I have a lot of cases where I still need to remind the agent to do something even if it's routine.

To me, that connects with spending longer on planning and specs. It requires reading and re-reading, but when that's done, implementation is usually much cleaner and adheres to your standards.

alexwwang 21 days ago

Yes. They are tools. So my approach, or at least what I am trying, is to keep polishing the skills and to check the LLM’s output in loops with MCP, alerting on abnormalities as soon as possible so the LLM won’t go on to the next step and make things worse.
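
As a rough sketch of that "check before the next step" loop: the code below runs steps one at a time and halts at the first output the checker flags, so later steps never execute. It is plain Python standing in for whatever MCP tooling is actually used; `run_gated` and the `Step` and `Check` callables are hypothetical names for illustration.

```python
from typing import Callable, Iterable, Optional

Step = Callable[[], str]                 # runs one agent step, returns its output
Check = Callable[[str], Optional[str]]   # returns a problem description, or None


def run_gated(
    steps: Iterable[tuple[str, Step]], check: Check
) -> tuple[list[str], Optional[str]]:
    """Run steps in order; stop at the first output the checker rejects."""
    completed: list[str] = []
    for name, step in steps:
        output = step()
        problem = check(output)
        if problem is not None:
            # Halt here so the remaining steps cannot build on a bad result.
            return completed, f"{name}: {problem}"
        completed.append(name)
    return completed, None
```

For example, a checker that returns a message whenever the output contains an error marker would stop the run at the first failing step and report which step it was.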