Hacker News
by yunohn 238 days ago
There’s actually research being done in this space that you might find interesting, on “attention sinks”: https://arxiv.org/abs/2503.08908