
Right, but consider how it is 'evaluated' during training: it constantly sees examples where the relevant context has fallen outside the window, yet the correct completion still answers confidently, so the model is trained to do the same.

I think this is very tricky to solve conceptually (since the human authors don't have the same input event horizon problem), but it could be (and has been) papered over by making the context bigger.
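To make that concrete, here is a toy sketch of what a fixed-window next-token objective would see. Everything in it is made up for illustration (the `training_examples` helper, the whitespace "tokenizer", the example sentence, the window sizes); it is not how any particular model is actually trained.

```python
def training_examples(tokens, window):
    """Yield (context, target) pairs the way a fixed-window
    next-token objective would see them (hypothetical helper)."""
    for i in range(1, len(tokens)):
        start = max(0, i - window)
        yield tokens[start:i], tokens[i]


# Toy document: the dog's name is stated early, then used again much later.
doc = ("the dog is named Rex . it rained all day and night . "
       "she called the dog Rex").split()

# Small window: the sentence that names the dog is already out of view,
# but the training target is still the confident answer.
small_context, small_target = list(training_examples(doc, window=4))[-1]
print(small_context, "->", small_target)

# Bigger window: the same target is now supported by the visible context.
big_context, big_target = list(training_examples(doc, window=32))[-1]
print("named" in small_context, "named" in big_context)
```

With the small window the model is rewarded for producing "Rex" from a context that never says what the dog is called; widening the window puts the supporting sentence back in view, which is the papering-over part.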