Hacker News
by
famouswaffles
618 days ago
For transformers, the whole context is already attended to again at every token. They can fetch information that became useful any time they want. I don't see what problem there is to solve here.
1 comment
tsimionescu
618 days ago
For a transformer, context is limited, so the same kind of problem applies after you exceed some size.
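To make the point concrete, here is a minimal sketch, assuming the simplest possible policy: once the input exceeds the window, only the most recent tokens are kept. The function name `visible_context`, the `max_context` parameter, and the toy tokens are all made up for illustration; real systems may truncate or compress context differently.

```python
from collections import deque


def visible_context(tokens, max_context):
    """Return the tokens a model with a fixed-size window could still attend to.

    Assumption: the oldest tokens are simply dropped once the window is full.
    """
    window = deque(maxlen=max_context)  # oldest entries fall off the left
    for tok in tokens:
        window.append(tok)
    return list(window)


tokens = ["key=42", "a", "b", "c", "d"]

# Window large enough: the early fact is still reachable at the last token.
print("key=42" in visible_context(tokens, max_context=8))

# Window too small: the early fact is gone before it became useful.
print("key=42" in visible_context(tokens, max_context=3))
```

Within the window, anything can be fetched at any step; past it, information that only later turns out to matter is no longer there to fetch.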