by enjeyw 159 days ago

One of the big problems with attention mechanisms is that each query has to look over every single key, which becomes very expensive for long contexts.
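To put a rough number on that cost, here is a back-of-the-envelope sketch. It assumes plain dense causal attention (each token attends to itself and every earlier token) and counts query-key dot products for a single head in a single layer; the function name is made up for illustration.

```python
def causal_attention_scores(context_len: int) -> int:
    """Count query-key dot products for dense causal attention (one head, one layer)."""
    # Token i (1-indexed) scores against keys 1..i, so the total is 1 + 2 + ... + n.
    return context_len * (context_len + 1) // 2


# Growing the context 10x grows the number of dot products roughly 100x.
for n in (1_000, 10_000, 100_000):
    print(n, causal_attention_scores(n))
```

Every key that gets evicted from the cache removes one dot product from each later query, which is why shrinking the key set matters most at long context lengths.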

A little side project I've been working on is to train a model that sits on top of the LLM, looks at each key, and decides whether that key is still needed once a certain lifespan has passed. If it isn't, the key is evicted as soon as the lifespan expires. I'm still working on it, but my first-pass test shows a 90% reduction in the number of keys!
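A minimal sketch of the eviction rule described above, to make the idea concrete. This is not the smartkv code: `CacheEntry`, `lifespan`, `keep_after` and `evict_expired` are hypothetical names, and the per-key predictions are hard-coded here where a trained model would supply them.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    position: int     # step at which this key was written to the cache
    lifespan: int     # steps the key is always kept (assumed predictor output)
    keep_after: bool  # assumed predictor verdict: still needed after the lifespan?


def evict_expired(cache: list[CacheEntry], current_step: int) -> list[CacheEntry]:
    """Return the entries that survive at current_step."""
    survivors = []
    for entry in cache:
        expired = current_step - entry.position >= entry.lifespan
        if expired and not entry.keep_after:
            continue  # lifespan is over and the key was judged unnecessary: drop it
        survivors.append(entry)
    return survivors


cache = [
    CacheEntry(position=0, lifespan=2, keep_after=True),   # expired, but still needed
    CacheEntry(position=1, lifespan=2, keep_after=False),  # expired, evictable
    CacheEntry(position=2, lifespan=8, keep_after=False),  # lifespan not over yet
    CacheEntry(position=3, lifespan=1, keep_after=False),  # expired, evictable
]
kept = evict_expired(cache, current_step=5)
print([entry.position for entry in kept])  # [0, 2]
```

In a real setup the matching value vectors would be dropped along with the keys, and the filter would run per head and per layer.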

https://github.com/enjeyw/smartkv

1 comment

krackers 158 days ago

Is this not similar to DeepSeek's lightning indexer?
