Hacker News new | ask | show | jobs
by whstl 26 days ago
I don't believe this is a context problem.

Claude Opus 4.7 has a very large context compared to itself, but IME it is the worst at following instructions, and completely disregards the (small) preferences prompt, even in the first or second message, even if the messages are just a few characters long.

IMO this is entirely a training problem.

3 comments

Isn't a large context window still a problem though? At the upper bound, the more you put in the more each sentence washes out within that window?
I’m not talking about large amounts of text, I’m talking about a couple sentences back and forth.

It disregards things like “no follow up questions”.

Haiku, for example doesn’t.

This bias is a very human thing, actually now that I think about it. You just disregarded the “even if the messages are just a few characters long”. :)

haha! yes i read too fast but i did read it and i took "message is small" to mean the message you want followed within the large context, not the entire context is just a small message.

funny though it is a case in point: language is hard. and i get to hide behind being "preoccupied" . i wonder if llms have their own sense of preoccupation hmmm.

It's probably some internal conflict between following the original training and following user prompts.

Also reminds me of the gremlin issue with GPT. An (internal) prompt saying "don't say gremlins" wasn't enough.

Codex compaction is way better imo.

I've had many long-running sessions and it doesn't suffer the same retardation (the act of delaying, slowing down, or hindering progress) that Opus does.

The quality stays consistent and it actually seems to follow the instructions, todos, etc. even after multiple compactions.

if you look at claude code, it now says compaction is happening constantly, which is likely why
If compaction is throwing away crucial prompting instructions even when it's at a 1% of maximum token usage (like my example), then it's a software bug, not an LLM artifact.
Doesn't compaction invalidate token caching, btw?
I don't see how this has anything to do with my message, sorry.