by eitanlebras
64 days ago
The middleware being adapted from LangChain DeepAgents is interesting. There's a known issue with LangChain's summarization middleware where the model.invoke calls made during compression leak into the SSE stream as visible assistant turns instead of staying internal. Curious whether you hit that and how you handled it, since you're streaming everything over SSE anyway?

We also build a lot on top of it, like more accurate token estimation, a customized offloading mechanism, etc.