| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by jinko-niwashi 104 days ago

Your "don't fucking touch that file" experience is the exact pattern I kept hitting. After 400+ sessions of full-time pair programming with Claude, I stopped trying to fix it with prompt instructions and started treating it as a permissions problem.

The model drifts because nothing structurally prevents it from drifting. Telling it "don't touch X" is negotiating behavior with a probabilistic system — it works until it doesn't. What actually worked: separating the workflow into phases where certain actions literally aren't available. Design phase? Read and propose only. Implementation phase? Edit, but only files in scope.

Your security example is even more telling — the model folding under minimal pushback isn't a knowledge gap, it's a sycophancy gradient. No amount of system prompting fixes that. You need the workflow to not ask the model for a judgment call it can't be trusted to hold.