> Theoretically with an infinite context window a model would just work fine forever by shoving the entire conversation history into context with each request. But a message search/retrieval makes a lot more sense.

Nope. With an infinite context window the LLM would take forever to give you an answer, because the work per request grows with the amount of context it has to attend over. So in practice it would be useless.
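To put rough numbers on the scaling concern, here's a back-of-the-envelope sketch. It assumes plain dense self-attention, where every token is scored against every other token, and it counts pairs for a single head in a single layer. The function name `attention_pairs` is made up for illustration, and real systems use caching and other optimizations that this ignores.

```python
def attention_pairs(n_tokens: int) -> int:
    """Query-key pairs scored by dense self-attention over n_tokens.

    Assumption: one head, one layer, no caching or sparse-attention tricks.
    """
    if n_tokens < 0:
        raise ValueError("n_tokens must be non-negative")
    return n_tokens * n_tokens


if __name__ == "__main__":
    # Each 10x increase in context means 100x more pairs to score.
    for n in (1_000, 10_000, 100_000):
        print(f"{n:>7} tokens -> {attention_pairs(n):.1e} pairs")
```

Under that assumption, doubling the context quadruples the work, so an unbounded window means unbounded compute per request.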

We don't really have such a thing as a context window; it's an artifact of LLM architecture. We are building a ton of technology around it, but who's to say it's the right approach?

Maybe the best AIs will only use a very tiny LLM for actual language processing while delegating storage and compression of memories to something that's actually built for that.
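As a toy illustration of that split, here's a minimal sketch where memory lives in a separate store and only the few most relevant messages get handed to the language model. Everything here is hypothetical: `MemoryStore`, `build_prompt`, and the word-overlap scoring are stand-ins I picked for brevity, not how any real system does it (a real one would more likely use embeddings or a proper search index).

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class MemoryStore:
    """Hypothetical external memory: stores messages, retrieves by word overlap."""

    def __init__(self) -> None:
        self._messages: list[str] = []

    def add(self, message: str) -> None:
        self._messages.append(message)

    def search(self, query: str, k: int = 2) -> list[str]:
        """Return up to k stored messages sharing the most words with the query."""
        query_counts = Counter(tokenize(query))
        scored = []
        for index, message in enumerate(self._messages):
            overlap = sum((query_counts & Counter(tokenize(message))).values())
            if overlap > 0:
                scored.append((overlap, index, message))
        # Highest overlap first; earlier messages win ties.
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [message for _, _, message in scored[:k]]


def build_prompt(store: MemoryStore, question: str, k: int = 2) -> str:
    """Assemble a small prompt: retrieved memories followed by the question."""
    return "\n".join(store.search(question, k) + [question])


if __name__ == "__main__":
    store = MemoryStore()
    store.add("My dog is named Rex")
    store.add("I like pizza")
    store.add("The weather is sunny")
    print(build_prompt(store, "What is my dog named?", k=1))
```

The point of the design is that the prompt size depends on `k`, not on how long the conversation history has become.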