	by wcoenen 25 days ago
	The UX problem is elsewhere, I think. Many users probably don't realize that the agent's context window is limited, and that clever compaction is happening regularly to make it seem infinite. But that necessarily means the agent has to forget stuff. As a result, users keep reusing the same coding or chat session again and again, when it would be better to start fresh for unrelated tasks.
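
For readers unfamiliar with the term, here is a minimal sketch of what "compaction" could look like. It is an illustration under stated assumptions, not how any particular agent implements it: `Message`, `count_tokens`, `summarize`, and `compact` are hypothetical names, tokens are approximated by whitespace-separated words, and the "summary" is a crude truncation standing in for a model-written one.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str  # "system", "user", "assistant", or "summary"
    text: str


def count_tokens(text: str) -> int:
    # Assumption: one token per whitespace-separated word (real tokenizers differ).
    return len(text.split())


def total_tokens(messages: list[Message]) -> int:
    return sum(count_tokens(m.text) for m in messages)


def summarize(messages: list[Message]) -> Message:
    # Stand-in for a model-written summary: keep the first 5 words of each message.
    # Everything past those words is lost, which is the "forgetting" step.
    clipped = [" ".join(m.text.split()[:5]) for m in messages]
    return Message("summary", " / ".join(clipped))


def compact(messages: list[Message], budget: int, keep_recent: int = 2) -> list[Message]:
    """Shrink the history once it exceeds `budget` tokens.

    System messages are kept verbatim, the most recent `keep_recent`
    messages are kept verbatim, and everything in between is replaced
    by a single lossy summary message.
    """
    if total_tokens(messages) <= budget:
        return list(messages)  # under budget: nothing is dropped

    pinned = [m for m in messages if m.role == "system"]
    rest = [m for m in messages if m.role != "system"]
    split = max(len(rest) - keep_recent, 0)
    old, recent = rest[:split], rest[split:]
    if not old:
        return list(messages)  # nothing old enough to summarize
    return pinned + [summarize(old)] + recent


history = [
    Message("system", "no follow up questions"),
    Message("user", "a b c d e f g h i j"),
    Message("assistant", "k l m n o p q r s t"),
    Message("user", "u v"),
    Message("assistant", "w x"),
]
compacted = compact(history, budget=20)
```

In this sketch the older turns survive only as a truncated summary, so details from early in a long session can disappear even though the session appears continuous. A history that is still under budget is returned unchanged.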

whstl 25 days ago

I don't believe this is a context problem.

Claude Opus 4.7 has a very large context compared to itself, but IME it is the worst at following instructions, and completely disregards the (small) preferences prompt, even in the first or second message, even if the messages are just a few characters long.

IMO this is entirely a training problem.

apsurd 25 days ago

Isn't a large context window still a problem though? At the upper bound, the more you put in the more each sentence washes out within that window?

whstl 25 days ago

I’m not talking about large amounts of text, I’m talking about a couple sentences back and forth.

It disregards things like “no follow up questions”.

Haiku, for example, doesn’t.

This bias is a very human thing, actually now that I think about it. You just disregarded the “even if the messages are just a few characters long”. :)

apsurd 25 days ago

haha! yes i read too fast but i did read it and i took "message is small" to mean the message you want followed within the large context, not the entire context is just a small message.

funny though it is a case in point: language is hard. and i get to hide behind being "preoccupied". i wonder if llms have their own sense of preoccupation hmmm.

whstl 24 days ago

It's probably some internal conflict between following the original training and following user prompts.

Also reminds me of the gremlin issue with GPT. An (internal) prompt saying "don't say gremlins" wasn't enough.

moomoo11 24 days ago

Codex compaction is way better imo.

I've had many long-running sessions and it doesn't suffer the same retardation (the act of delaying, slowing down, or hindering progress) that Opus does.

The quality stays consistent and it actually seems to follow the instructions, todos, etc. even after multiple compactions.

8note 24 days ago

if you look at claude code, it now says compaction is happening constantly, which is likely why

whstl 24 days ago

If compaction is throwing away crucial prompting instructions even when it's at 1% of maximum token usage (like my example), then it's a software bug, not an LLM artifact.

tacone 24 days ago

Doesn't compaction invalidate token caching, btw?

whstl 24 days ago

I don't see how this has anything to do with my message, sorry.

poly2it 25 days ago

The author of this post and the readers of this thread probably do understand context window limitations, but are frustrated nonetheless.

dust-jacket 24 days ago

Well yeah. And there's little more frustrating than someone telling you not to be frustrated because "that's just how it works".

We get how it works. It's just irritating.

kolinko 24 days ago

I think the post author is smarter than that.

I usually work with sessions <300k tokens, Opus 4.7 xhigh, and it simply has holes in its world model, or some strong conditioning here and there, and it seeps through regardless of how strongly you say things and how explicit the rules in the system prompt are.

Even with a fresh session, if you bump into one of these things, it will lead you in circles that are very hard to break out of. And swearing helps a bit.
