Part of the confusion here is that some people (apparently including you) use the word "LLM" to refer to the entire system that's built up around the language model itself, while others (like OP) are specifically referring to the large language model.

The large language model's context window absolutely is ephemeral. By the time inference begins, all the model has is a large numerical encoding of the context to date: token IDs mapped to vectors, not the text itself. The model cannot go back and look at the characters; it only has the encoded "memory" of that text.
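To make the "encoded memory" point concrete, here is a rough sketch. It assumes a toy word-level tokenizer and a random embedding table, both invented for illustration; real models use subword tokenizers and learned weights, and the names `build_vocab`, `encode`, and `embed` are hypothetical.

```python
# Toy illustration only: a made-up word-level vocabulary and random embeddings,
# not any real model's tokenizer or weights.
import random


def build_vocab(text):
    """Assign an integer ID to each distinct whitespace-separated token."""
    vocab = {}
    for tok in text.split():
        vocab.setdefault(tok, len(vocab))
    return vocab


def encode(text, vocab):
    """Turn text into a list of token IDs."""
    return [vocab[tok] for tok in text.split()]


def embed(ids, table):
    """Look up one vector per token ID."""
    return [table[i] for i in ids]


random.seed(0)
prompt = "the context window holds the text"
vocab = build_vocab(prompt)
ids = encode(prompt, vocab)

# One random 4-dimensional vector per vocabulary entry (assumed size).
table = [[random.random() for _ in range(4)] for _ in vocab]
vectors = embed(ids, table)

# From here on, only `vectors` would be passed forward: the characters of
# `prompt` are not part of what the downstream computation sees.
print(ids)
print(len(vectors), len(vectors[0]))
```

Anything that depends on the literal characters has to be recovered from those numbers, or handled outside the model by code that still holds the original string.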

OP is simply saying that the underlying model is unsuitable for solving problems like this directly, so it makes a bad example of how models fail to use their context effectively. A production-grade AI agent should be able to solve problems like this, but it will likely do so through external scaffolding, not through improvements to the model itself. Improvements to the context window, by contrast, will probably need to happen at the model level.