by Van_Chopiszt 994 days ago
The authors just uploaded an FAQ section, which may clear up some of the confusion: https://github.com/mit-han-lab/streaming-llm/blob/main/READM...
1 comment

Nice update. I think the key question they added, the one that clarifies a lot, is #3 (quoted below):

    Can I input an extensive text, like a book, into StreamingLLM for summarization?

    While you can input a lengthy text, the model will only recognize the latest tokens. Thus, if a book is an input, StreamingLLM might only summarize the concluding paragraphs, which might not be very insightful. As emphasized earlier, we neither expand the LLMs' context window nor enhance their long-term memory. StreamingLLM's strength lies in generating fluent text from recent tokens without needing a cache refresh.
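So instead of feeding the model chunks of tokens, we can feed it a continuous stream of tokens and then at some point say "LLM, take the wheel". The model keeps generating fluently, but only from the most recent tokens. So it is very nice, but not revolutionary.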
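
To make "the model will only recognize the latest tokens" concrete, here is a toy sketch of a rolling cache. It is not the authors' code: the class name, `n_sink`, and `n_recent` are made up for illustration, and it stores plain strings where a real implementation would hold key/value tensors. It assumes the policy as I understand it from the paper, which is to keep a handful of initial tokens plus a sliding window of recent ones and evict everything in between.

```python
from collections import deque


class RollingTokenCache:
    """Toy stand-in for a KV cache (hypothetical, not the StreamingLLM API).

    Keeps the first n_sink tokens forever and only the last n_recent
    tokens after that; everything in between is evicted.
    """

    def __init__(self, n_sink=4, n_recent=8):
        self.n_sink = n_sink
        self.sink = []                         # initial tokens, never evicted
        self.recent = deque(maxlen=n_recent)   # oldest entry drops automatically
        self.seen = 0                          # total tokens streamed so far

    def push(self, token):
        self.seen += 1
        if len(self.sink) < self.n_sink:
            self.sink.append(token)
        else:
            self.recent.append(token)

    def visible(self):
        # What the model could still attend to at this point in the stream.
        return self.sink + list(self.recent)


cache = RollingTokenCache(n_sink=2, n_recent=4)
for tok in "a very long book that keeps going and going until the end".split():
    cache.push(tok)

print(cache.seen)       # 12 tokens went in
print(cache.visible())  # ['a', 'very', 'going', 'until', 'the', 'end']
```

The cache size stays constant no matter how long the stream is, which is why nothing needs a refresh, but the middle of the "book" is simply gone. That matches the FAQ answer: a summary would only reflect the tail end of the input.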