Hacker News
by wolttam 84 days ago
The sky seems like the limit to me. 100M doesn't actually seem like that much when you get into vision models or embodied robots operating with contexts on the order of several days or weeks.

The more we can drive towards selective attention over larger and larger sets of "working memory", the better, I think.

1 comment

Maybe the mechanism for memory is only tangentially related to the context window.

I suspect cleverer mechanisms for context injection, pruning, and updating would do more for effective memory than increasing the context window forever, regardless of what tricks we apply to distill attention over it.

There is probably a lot of low-hanging fruit in this area.