Hacker News new | ask | show | jobs
by Der_Einzige 976 days ago
ChatGPT degrades precisely because they aren't doing anything special to extend their memory beyond the context length.

There are trivial techniques to implement "lossy" memory, such as just average pooling tokens (the same approach used by sentence transformers). Not sure why it's so rare to see this used for condensing a huge amount of context into a prompt. It is effectively "medium" term memory.

2 comments

https://chat.openai.com/share/e367a1de-c28b-4408-aa3d-2e4b85...

Fed chatGPT special numbers, then 3k tokens, then 2k tokens. after that, it was unable to understand any question about the special numbers provided.

At the very least I would average vectors inside single words or word compounds getting a 2-3x reduction in length without much work.