by eitanlebras 64 days ago
The middleware being adapted from LangChain DeepAgents is interesting... there's a known issue with LangChain's summarization middleware where model.invoke calls during compression leak into the SSE stream as visible assistant turns rather than staying internal. Curious if you hit that and how you handled it, since you're streaming everything over SSE anyway?

We ran into that issue and fixed it. We also added a separate SSE event type that signals the start and end of summarization, so the UI can show it.
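For anyone curious what that could look like, here is a minimal sketch of the general idea, not the actual implementation. It assumes each streamed chunk is a plain dict tagged with a hypothetical `source` field (`"agent"` or `"summarizer"`); the event names `summarization_start` and `summarization_end` and the helper names are made up for illustration. Summarizer output is dropped from the stream, and the start/end events bracket it.

```python
import json
from typing import Iterable, Iterator


def format_sse(event: str, data: dict) -> str:
    """Serialize one Server-Sent Events frame: event name plus JSON data."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_sse(chunks: Iterable[dict]) -> Iterator[str]:
    """Convert chunks to SSE frames while hiding summarizer output.

    Assumed (hypothetical) chunk shape:
        {"source": "agent" | "summarizer", "text": str}
    """
    summarizing = False
    for chunk in chunks:
        is_internal = chunk["source"] == "summarizer"
        if is_internal and not summarizing:
            # First summarizer chunk: tell the UI that compression started.
            summarizing = True
            yield format_sse("summarization_start", {})
        elif not is_internal and summarizing:
            # First agent chunk after summarization: tell the UI it finished.
            summarizing = False
            yield format_sse("summarization_end", {})
        if not is_internal:
            # Only agent output is forwarded as a visible message.
            yield format_sse("message", {"text": chunk["text"]})
    if summarizing:
        # The stream ended mid-summarization; close the bracket anyway.
        yield format_sse("summarization_end", {})
```

The client can then show a "compacting context" indicator between the two events instead of rendering the summary as an assistant turn.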

We've also built a lot on top of it, such as more accurate token estimation and a customized offloading mechanism.

Oh nice, that separate SSE event seems like a big improvement for UX.