by insomagent 139 days ago

I'm not super impressed with the performance, actually. I'm finding that it misunderstands me quite a bit. While it is definitely better at reading big codebases and finding a needle in a haystack, it's nowhere near as good as Opus 4.5 at reading between the lines and figuring out what I really want it to do, even with a pretty well defined issue.

It also has a habit of "running wild". Say I tell it, "First, verify you understand everything and then we will implement it."

Well, it DOES output its understanding of the issue. And it's pretty spot-on on the analysis of the issue. But, importantly, it did not correctly intuit my actual request: "First, explain your understanding of this issue to me so I can validate your logic. Then STOP, so I can read it and give you the go ahead to implement."

I think the main issue we are going to see with Opus 4.6 is this "running wild" phenomenon, which is step 1 of the eternal paperclip optimizer machine. So be careful, especially when using "auto accept edits".

2 comments

soulofmischief 139 days ago

I am having trouble with 4.6 following the most basic of instructions.

As an example, I asked it to commit everything in the worktree. I stressed "everything" and prompted it very explicitly, because even 4.5 sometimes likes to say, "I didn't do that other stuff, I'm only going to commit my stuff even though he said everything".

It still only committed a few things.

I had to ask again.

And again.

I had to ask four times, with increasing amounts of expletives and threats in order to finally see a clean worktree. I was worried at some point it was just going to solve the problem by cleaning the workspace without even committing.

4.5 is way easier to steer, despite its warts.

scwoodal 139 days ago

Tell it explicitly which git commands to run, and in what order, for your desired outcome, instead of “commit everything in the worktree”.

This prompt will work better across any/all models.
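A minimal sketch of what "explicit commands in order" could look like, assuming you build the prompt programmatically. The step list, the commit message, and the helper names `build_prompt` and `worktree_is_clean` are all hypothetical and only for illustration; the git commands themselves are standard.

```python
import shlex

# Hypothetical command sequence; the commit message is a placeholder.
COMMIT_ALL_STEPS = [
    ["git", "status", "--short"],
    ["git", "add", "-A"],
    ["git", "commit", "-m", "Commit all worktree changes"],
    ["git", "status", "--porcelain"],
]


def build_prompt(steps):
    """Render the steps as a numbered, ordered instruction list."""
    lines = ["Run exactly these commands, in this order, and nothing else:"]
    for number, command in enumerate(steps, start=1):
        # shlex.join quotes arguments that contain spaces.
        lines.append(f"{number}. {shlex.join(command)}")
    lines.append("The last command must print nothing.")
    return "\n".join(lines)


def worktree_is_clean(porcelain_output):
    """`git status --porcelain` prints one line per change; empty means clean."""
    return porcelain_output.strip() == ""


print(build_prompt(COMMIT_ALL_STEPS))
```

Feeding the agent's final `git status --porcelain` output to `worktree_is_clean` gives a mechanical check instead of asking again and hoping.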

axelthegerman 139 days ago

> Tell it what git commands to explicitly run and in what order

Why not run the commands yourself then?

scwoodal 139 days ago

Changes introduced outside the agent window create a new state that is different from the agent's.

After commands or changes are made outside of the agent's doing, the agent would notice its world view changed and eventually recover, but that fills up precious context as it brings itself up to date.

soulofmischief 139 days ago

I have seen many cases of Claude ignoring extremely specific instructions to the point that any further specificity would take more information to express than just doing it myself.

scwoodal 139 days ago

When I run into those situations I debug and try to understand why. Agent harnesses that allow you to rewind (/tree) are useful for this.

It’s often because the context is full, I gave a bad prompt, or the context has conflicting guidance from either direct or indirect (agents.md) prompts.

soulofmischief 138 days ago

It's easy to get these models to introspect and give quite detailed and intelligent responses about why they erred, and to work with them to create better instructions for future agents to follow. That doesn't solve the steering problem, however, if they still do not listen well to these instructions.

I spend 8-20 hours a day coding nonstop with agentic models and you can believe I have tuned my approach quite a lot. This isn't a case of inexperience or conflicting instructions. The RL which gives Opus its fantastic ability to just knock out features is the same RL which causes it to constantly accumulate tech debt through short-sighted decisions.

songodongo 139 days ago

I have run into this. The solution is to put something like “Always use `git add -A` or `git commit -a`” in your AGENTS/CLAUDE.md

soulofmischief 139 days ago

Small, targeted commits are more professional than sweeping `git add -A` commits, but even when specifying my requirements through whichever context management system of the week, I still have issues with it sometimes. It seems to be much worse on the new 4.6 model.

docjay 139 days ago

You might benefit from a different mental approach to prompting, and models in general. Also, be careful what you wish for because the closer they get to humans the worse they’ll be. You can’t have “far beyond the realm of human capabilities” and “just like Gary” in the same box.

They can chain events together as a sequence, but they don’t have temporal coherence. For those that are born with dimensional privilege, “Do X, discuss, then do Y” implies time passing between events, but to a model it’s all a singular event at t=0. The system pressed “3 +” on a calculator and your input presses a number and “=”. If you see the silliness in telling it “BRB” then you’ll see the silliness in foreshadowing ill-defined temporal steps. If it CAN happen in a single response then it very well might happen.

“

Agenda for today at 12pm:

1. Read junk.py

2. Talk about it for 20 minutes

3. Eat lunch for an hour

4. Decide on deleting junk.py

”

<response>

12:00 - I just read junk.py.

12:00-12:20 - Oh wow it looks like junk, that’s for sure.

12:20-1:20 - I’m eating lunch now. Yum.

1:20 - I’ve decided to delete it, as you instructed. {delete junk.py}

</response>

Because of course, right? What does “talk about it” mean beyond “put some tokens here too”?

If you want it to stop reliably you have to make it output tokens whose next most probable token is EOS (end). Meaning you need it to say what you want, then say something else where the next most probable token after it is <null>.
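A toy sketch of that idea, assuming a made-up table of next-chunk probabilities and plain greedy decoding. Nothing here comes from a real model: the table `TOY_NEXT`, its numbers, and `greedy_continue` are purely illustrative, and real sampling at temperature 1.0 is noisier than always taking the most probable option.

```python
EOS = "<EOS>"

# Toy, made-up next-chunk probabilities; not taken from any real model.
TOY_NEXT = {
    "Awaiting instructions.": {EOS: 0.95, "Meanwhile,": 0.05},
    "Here is my understanding.": {"Now implementing:": 0.6, EOS: 0.4},
    "Now implementing:": {"{edit junk.py}": 1.0},
    "{edit junk.py}": {EOS: 1.0},
}


def greedy_continue(last_chunk, table, max_steps=10):
    """Follow the most probable continuation until EOS; return what was emitted."""
    emitted = []
    current = last_chunk
    for _ in range(max_steps):
        # Unknown chunks are treated as ending the response.
        options = table.get(current, {EOS: 1.0})
        best = max(options, key=options.get)
        if best == EOS:
            break
        emitted.append(best)
        current = best
    return emitted


# Ends immediately: the most probable thing after this chunk is EOS.
print(greedy_continue("Awaiting instructions.", TOY_NEXT))
# Keeps going: the most probable thing after this chunk is more work.
print(greedy_continue("Here is my understanding.", TOY_NEXT))
```

In this toy table the response only stops where the last thing said is most likely followed by nothing, which is the property the prompt has to steer toward.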

I’ve tested well over 1,000 prompts on Opus 4.0-4.5 for the exact issue you’re experiencing. The test criterion was having it read a Python file that desperately needs a hero, but without having it immediately volunteer as tribute and run off chasing a squirrel() into the woods.

With thinking enabled the temperature is 1.0, so randomness is maximized, and that makes it easy to find something that always sometimes works unless it doesn’t. “Read X and describe what you see.” - That worked very well with Opus 4.0. Not “tell me what you see”, “explain it”, “describe it”, “then stop”, “then end your response”, or any of hundreds of others. “Describe what you see” worked particularly well at aligning read file->word tokens->EOS… in 176/200 repetitions of the exact same prompt.

What worked 200/200 on all models and all generations? “Read X then halt for further instructions.” The reason that works has nothing to do with the model excitedly waiting for my next utterance, but rather that the typical response tokens for that step are “Awaiting instructions.” and the next most probable token after that is: nothing. EOS.
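A rough sketch of how that kind of repetition test could be scripted, assuming your harness exposes some callable that takes a prompt and reports which files the agent edited. The `run_prompt` interface, the `edited_files` key, and `make_canned_agent` are hypothetical stand-ins, and the canned responses are placeholder data, not measurements.

```python
from itertools import cycle


def stop_rate(run_prompt, prompt, repetitions=200):
    """Send the identical prompt `repetitions` times and count the responses
    that ended without editing any file."""
    stopped = 0
    for _ in range(repetitions):
        response = run_prompt(prompt)
        if not response["edited_files"]:
            stopped += 1
    return stopped, repetitions


def make_canned_agent(responses):
    """Hypothetical stand-in for a real agent call: replays canned responses."""
    replay = cycle(responses)
    return lambda prompt: next(replay)


# Stand-in data only; these are not results from any model.
agent = make_canned_agent([
    {"text": "Awaiting instructions.", "edited_files": []},
    {"text": "I went ahead and rewrote it.", "edited_files": ["junk.py"]},
])
print(stop_rate(agent, "Read junk.py then halt for further instructions.", repetitions=4))
# prints (2, 4)
```

Swapping the canned agent for a real call lets you compare phrasings by their stop counts over the same number of repetitions.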
