HN Mirror



	by rozap 1064 days ago
	So this dumps the documents returned from the vector store into a prompt to the LLM. How does it work when there are many documents returned? What's the upper limit there?

1 comment

jasonwcfan 1064 days ago

Yep. We use LangChain's basic text splitter to chunk the documents and the QA chain to stuff the chunks into the prompt. But AFAIK it doesn't check the total against the model's context length, so that's a piece that's still missing.

The upper limit depends on the model. Llama 2's context window is 4k tokens, including the prompt.
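
For illustration only, here is a minimal sketch of the missing context-length check: keep retrieved chunks in order until the token budget runs out, then drop the rest. It deliberately avoids any LangChain calls. The function names, the 512-token reserve for the answer, and the 4-characters-per-token estimate are all assumptions made for this example, not part of the project; a real check would use the model's own tokenizer.

```python
from typing import Callable, List


def approx_token_count(text: str) -> int:
    """Rough estimate, ASSUMING about 4 characters per token.

    Hypothetical helper: swap in the model's real tokenizer for accuracy.
    """
    return max(1, (len(text) + 3) // 4)


def fit_chunks_to_context(
    chunks: List[str],
    question: str,
    context_limit: int = 4096,       # Llama 2's window, prompt included
    reserved_for_answer: int = 512,  # assumed headroom for the generated answer
    count_tokens: Callable[[str], int] = approx_token_count,
) -> List[str]:
    """Return the leading chunks that fit within the remaining token budget."""
    budget = context_limit - reserved_for_answer - count_tokens(question)
    kept: List[str] = []
    for chunk in chunks:
        cost = count_tokens(chunk)
        if cost > budget:
            break  # stop at the first chunk that would overflow the window
        kept.append(chunk)
        budget -= cost
    return kept


if __name__ == "__main__":
    retrieved = ["a" * 4000] * 10  # ten chunks of roughly 1000 tokens each
    fitted = fit_chunks_to_context(retrieved, "What?")
    print(f"kept {len(fitted)} of {len(retrieved)} chunks")
```

Stopping at the first chunk that doesn't fit keeps the highest-ranked results, assuming the vector store returns them in relevance order. Skipping oversized chunks and continuing would be an equally reasonable choice.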
