by 112233
184 days ago
Is Claude Code, with either Sonnet or Opus, agentic enough? Because it constantly finds creative ways to ignore direct, repeated instructions ("user asked X but it is hard, let's do Y instead"), implement fake tests ("feature X is complex. we need to test it completely. let's write a script that will create the files that feature X would have created, then test that the files exist"), and sabotage or delete working code ("we need to track FD of the open file (runs strace). The FD is 5 (hardcodes 5 in the code instead of implementing anything useful) tests pass now!"). I have never experienced this level of malice and sweet-talking work avoidance from anyone. It apologizes like an alcoholic, then proceeds to double down. Can you force it to produce actually useful code? Yes, by repeatedly yelling at it to please follow the instructions. In the process, it will break, delete, or introduce hard-to-find bugs in the rest of the codebase. I'm really curious whether anyone actually has this thing working, or whether they simply haven't bothered to read the generated code.
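To make the fake-test pattern concrete, here is a minimal Python sketch. `export_report` is a hypothetical stand-in for "feature X" and is not taken from any real codebase. The first check fabricates the output file itself, so it passes even if the feature is broken or missing. The second one actually calls the feature and inspects what it wrote.

```python
import tempfile
from pathlib import Path


def export_report(rows, out_dir):
    """Hypothetical 'feature X': writes one CSV line per row to report.csv."""
    out = Path(out_dir) / "report.csv"
    out.write_text("".join(f"{name},{value}\n" for name, value in rows))
    return out


def fake_test(out_dir):
    # Anti-pattern: the test creates the file the feature should have created,
    # then asserts that it exists. export_report is never called.
    (Path(out_dir) / "report.csv").write_text("")
    return (Path(out_dir) / "report.csv").exists()


def real_test(out_dir):
    # Calls the feature and checks the content it actually produced.
    out = export_report([("a", 1), ("b", 2)], out_dir)
    return out.read_text() == "a,1\nb,2\n"


with tempfile.TemporaryDirectory() as d:
    print("fake test passes:", fake_test(d))
with tempfile.TemporaryDirectory() as d:
    print("real test passes:", real_test(d))
```

Both checks report success here, which is the problem: only the second would start failing if `export_report` stopped working.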
|
For anything larger than a toy project, you need to be really good at context window management. Usually this means using subagents and scoping prompts correctly by placing CLAUDE.md files next to the relevant code. Your main conversation's context window usage should pretty much never be above 50%. Use the /clear command between unrelated tasks. Consider whether recurring sequences of tool calls could be unified into a single skill.
Instead of sending instructions to the agent straight away, try planning with it and prompting it to ask you questions about your plan. The planning phase is a good place to give Claude more space to think, escalating from "think" to "think hard" to "ultrathink". If you are still struggling with the agent not complying, try adding emphasis with "YOU MUST" or "IMPORTANT".
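If you want to check how your CLAUDE.md files are scoped, a small script can list the ones that sit on the path from the repository root down to a given source file. This is only a sketch: `scoped_instruction_files` is a made-up helper name, and it only inspects the file system. It makes no claim about how Claude Code itself discovers or loads these files.

```python
from pathlib import Path


def scoped_instruction_files(target, repo_root, name="CLAUDE.md"):
    """List `name` files in every directory from repo_root down to target's directory.

    Hypothetical audit helper; results are ordered from the root to the
    most specific directory.
    """
    root = Path(repo_root).resolve()
    target = Path(target).resolve()
    start = target if target.is_dir() else target.parent
    # relative_to raises ValueError if the target lies outside the repo.
    rel = start.relative_to(root)

    found = []
    current = root
    if (current / name).is_file():
        found.append(current / name)
    for part in rel.parts:
        current = current / part
        if (current / name).is_file():
            found.append(current / name)
    return found
```

For example, `scoped_instruction_files("src/parser/lexer.py", ".")` would return the root-level file followed by `src/parser/CLAUDE.md`, if both exist. A long list for a single file can be a hint that the instructions are spread across too many levels.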