Hacker News new | ask | show | jobs
by nerdponx 759 days ago
Right, but this is also a core limitation in the transformer architecture. You only have very short-term memory (context) and very long-term memory (fixed parameters). Real minds have a lot more flexibility in how they store and connect pieces of information. I suspect that further progress towards something AGI-like might require more "layers" of knowledge than just those two.

When I read a book, for example, I do not keep all of it in my short-term working memory, but I also don't entirely forget what I read at the beginning by the time I get to the end: it's something in between. More layered forms of memory would probably allow us to return to smaller context windows.

1 comments

I mean we have contexts now so large it dwarfs human short term memory right?

And then in terms of reading a book, a model's training could be updated with the book, right?