by dcrazy 107 days ago
Isn’t the purpose of self-attention exactly to recognize the relevance of some tokens over others?
1 comment

That may help with tokens being "ignored" while they are still in the context window, but it does nothing about the cost of context window size, or the limits on it, in the first place.
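
A rough sketch of the distinction, assuming plain scaled dot-product attention with no learned projections. The toy vectors and the function names `softmax` and `attention_weights` are made up for illustration and not taken from any particular model. Low-relevance tokens end up with near-zero weight, but a score is still computed for every query/key pair:

```python
import math

def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(queries, keys):
    """Scaled dot-product attention weights: one row of weights per query."""
    d = len(keys[0])
    rows = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        rows.append(softmax(scores))
    return rows

# Toy 2-d token vectors (hypothetical values), used as both queries and keys.
tokens = [[4.0, 0.0], [4.0, 0.1], [0.0, 0.1], [0.1, 0.0]]
weights = attention_weights(tokens, tokens)

# Row 0: the first token puts almost all of its weight on tokens 0 and 1.
print([round(w, 3) for w in weights[0]])

# The weight matrix still has n * n entries, no matter how small some are.
n = len(tokens)
print(sum(len(row) for row in weights), n * n)
```

So the down-weighting happens after the pairwise scores are computed. In this naive form the work grows with the square of the number of tokens in the window, whether or not most of them end up mattering.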