Hacker News
dcrazy 107 days ago
Isn’t the purpose of self-attention exactly to recognize the relevance of some tokens over others?
kimixa 107 days ago
That may help with tokens being "ignored" while still being in the context window, but it doesn't address the cost and size limits of the context window in the first place.
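To make the distinction concrete, here is a rough sketch of scaled dot-product attention weights in plain Python. The toy vectors and the helper names (`attention_weights`, `score_count`) are made up for illustration and are not taken from any particular model or library. It shows that an irrelevant token gets a small weight, yet a score is still computed for every query/key pair, so the work grows with the square of the context length.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(queries, keys):
    """Scaled dot-product attention weights: one row per query, one column per key."""
    d = len(keys[0])
    return [
        softmax([sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys])
        for q in queries
    ]

def score_count(n_tokens):
    # Full self-attention computes one score per (query, key) pair.
    return n_tokens * n_tokens

# Hypothetical 2-d token vectors: the third points away from the first two.
tokens = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]]
weights = attention_weights(tokens, tokens)

# The first token attends least to the third, but the weight is never exactly zero,
# and its score still had to be computed.
print(weights[0])
print(score_count(1000), score_count(2000))
```

Doubling the number of tokens quadruples the number of scores, regardless of how many of those tokens end up with negligible weight.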
link