by Me1000 913 days ago
Not OP and have no insight, but the thing that made it click for me was hearing “this token attends to that token”. Basically, a new value is created that represents how much one thing (in an LLM, a token) cares about another thing.

Saying “attends to” vs “attention” helped clarify (for me) the mechanics of what’s going on.
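
To make "this token attends to that token" concrete, here's a rough sketch in plain Python. It assumes the usual scaled dot-product formulation with a softmax. The token list and the 2-d vectors are made up for illustration, and a real model would use learned query/key projections instead of reusing the raw vectors as I do here.

```python
import math

def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(queries, keys):
    # weights[i][j] = how much token i attends to token j.
    d = len(keys[0])
    weights = []
    for q in queries:
        # Scaled dot product between this token's query and every key.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights.append(softmax(scores))
    return weights

# Hypothetical toy data: three tokens with made-up 2-d vectors.
tokens = ["the", "cat", "sat"]
vectors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

# Assumption: the same vectors serve as both queries and keys.
weights = attention_weights(vectors, vectors)
for tok, row in zip(tokens, weights):
    print(tok, [round(w, 2) for w in row])
```

Each printed row sums to 1, so row i reads as how token i splits its attention across all the tokens, itself included.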