by wcoenen 25 days ago
The UX problem is elsewhere I think. Many users probably don't realize that the agent's context window is limited, and that clever compaction is happening regularly to make it seem infinite. But that necessarily means the agent has to forget stuff.

As a result, users keep reusing the same coding or chat session again and again, when it would be better to start fresh for unrelated tasks.
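
To make the "has to forget stuff" point concrete, here is a rough sketch of what compaction could look like. Everything in it is an assumption for illustration: the `Message` type, the word-count "tokenizer", and the truncating `summarize` helper are hypothetical, and no real agent is claimed to work exactly this way (a real one would ask the model itself to write the summary).

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    text: str


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer (assumption): one token per word.
    return len(text.split())


def total_tokens(history: list[Message]) -> int:
    return sum(count_tokens(m.text) for m in history)


def summarize(messages: list[Message], max_words: int = 5) -> Message:
    # Hypothetical summarizer: keeps only the first few words of each message.
    # Whatever falls outside those words is gone for good.
    parts = [" ".join(m.text.split()[:max_words]) for m in messages]
    return Message("summary", " / ".join(parts))


def compact(history: list[Message], budget: int, keep_recent: int = 2) -> list[Message]:
    """Fold the oldest messages into one summary once the budget is exceeded."""
    if total_tokens(history) <= budget or len(history) <= keep_recent:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(old)] + recent
```

If an early message carries an instruction in its tail (say, "never ask follow up questions"), a lossy summary like this one can drop it while the session still looks continuous to the user. A fresh session for an unrelated task avoids carrying, and then losing, that old context.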


I don't believe this is a context problem.

Claude Opus 4.7 has a very large context compared to itself, but IME it is the worst at following instructions, and completely disregards the (small) preferences prompt, even in the first or second message, even if the messages are just a few characters long.

IMO this is entirely a training problem.

Isn't a large context window still a problem though? At the upper bound, the more you put in, the more each sentence washes out within that window?
I’m not talking about large amounts of text, I’m talking about a couple sentences back and forth.

It disregards things like “no follow up questions”.

Haiku, for example, doesn’t.

This bias is a very human thing, actually now that I think about it. You just disregarded the “even if the messages are just a few characters long”. :)

haha! yes i read too fast but i did read it and i took "message is small" to mean the message you want followed within the large context, not the entire context is just a small message.

funny though, it is a case in point: language is hard. and i get to hide behind being "preoccupied". i wonder if llms have their own sense of preoccupation hmmm.

It's probably some internal conflict between following the original training and following user prompts.

Also reminds me of the gremlin issue with GPT. An (internal) prompt saying "don't say gremlins" wasn't enough.

Codex compaction is way better imo.

I've had many long-running sessions and it doesn't suffer the same retardation (the act of delaying, slowing down, or hindering progress) that Opus does.

The quality stays consistent and it actually seems to follow the instructions, todos, etc. even after multiple compactions.

if you look at claude code, it now says compaction is happening constantly, which is likely why
If compaction is throwing away crucial prompting instructions even when it's at 1% of maximum token usage (like my example), then it's a software bug, not an LLM artifact.
Doesn't compaction invalidate token caching, btw?
I don't see how this has anything to do with my message, sorry.
The author of this post and the readers of this thread probably do understand context window limitations, but are frustrated nonetheless.
Well yeah. And there's little more frustrating than someone telling you not to be frustrated because "that's just how it works".

We get how it works. It's just irritating.

I think the post author is smarter than that.

I usually work with sessions <300k tokens, Opus 4.7 xhigh, and it simply has holes in its world model, or some strong conditioning here and there, and that seeps through regardless of how strongly you say things and how explicit the rules in the system prompt are.

Even with a fresh session, if you bump into one of these things, it will lead you in circles that are very hard to break out of. And swearing helps a bit.