Hacker News
by
samus
748 days ago
Current models can already be argued to have something like working memory: they store information in little-used parts of the tokens. If they are handed placeholder tokens that they can use as working memory, performance improves.
https://openreview.net/forum?id=2dnO3LLiJ1
https://news.ycombinator.com/item?id=40329675
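
To make the placeholder-token idea concrete, here is a rough sketch of the bookkeeping only, not the method from the linked paper. It assumes the extra tokens are simply appended to the input and that whatever the model produces at those positions is thrown away afterwards. The token name `<scratch>` and both helper functions are made up for illustration.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

# Hypothetical placeholder token; a real model would need a dedicated,
# trained token for this purpose.
PLACEHOLDER = "<scratch>"


def add_placeholders(tokens: Sequence[str], n: int,
                     placeholder: str = PLACEHOLDER) -> List[str]:
    """Return a copy of `tokens` with `n` placeholder tokens appended."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return list(tokens) + [placeholder] * n


def strip_placeholders(outputs: Sequence[T], tokens: Sequence[str],
                       placeholder: str = PLACEHOLDER) -> List[T]:
    """Keep only the outputs at positions that held real (non-placeholder) tokens."""
    return [out for out, tok in zip(outputs, tokens) if tok != placeholder]


if __name__ == "__main__":
    seq = add_placeholders(["the", "cat", "sat"], n=2)
    print(seq)
    # Stand-in for per-position model outputs (assumed, one per input token).
    fake_outputs = list(range(len(seq)))
    print(strip_placeholders(fake_outputs, seq))
```

The model is free to write whatever it likes into the placeholder positions, and only the outputs at the real positions are kept.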