by bpiche
990 days ago
This appears to be an open-source implementation of the same "Efficient Streaming Language Models with Attention Sinks" paper from MIT that was linked here 7 days ago. Published on Sept 29, 2023. https://news.ycombinator.com/item?id=37740932