| HN Mirror

Nice update. I think the key question they added that clarifies a lot is #3 (quoted below)

    Can I input an extensive text, like a book, into StreamingLLM for summarization?

    While you can input a lengthy text, the model will only recognize the latest tokens. Thus, if a book is an input, StreamingLLM might only summarize the concluding paragraphs, which might not be very insightful. As emphasized earlier, we neither expand the LLMs' context window nor enhance their long-term memory. StreamingLLM's strength lies in generating fluent text from recent tokens without needing a cache refresh.