by ilaksh
1205 days ago
5 pages fits in the context window. How exactly do you plan to expand the limit? Without an explanation, we have to assume you haven't completely solved your core technical challenges. In my testing, the biggest challenges with using, for example, OpenAI embeddings with cosine similarity are A) figuring out the section breaks or the right chunk size so that information stays in context, and B) retrieving enough chunks to get the correct hit for a query without pulling in so much extraneous information that it confuses the model. I think it's hard to make a parser that slices up arbitrary documents optimally. Since you have some larger documents preloaded, I assume you use embeddings search for those. But for user uploads you are skipping that for now and just feeding all of the text extracted from the PDF into the prompt along with the query.
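To make the chunk-size and retrieval trade-off concrete, here is a minimal sketch of the pipeline I mean. Everything in it is an assumption for illustration: `chunk_text`, `embed`, `cosine_similarity`, and `top_k_chunks` are hypothetical names, the chunk size and overlap are arbitrary, and `embed` is a toy bag-of-words stand-in for a real embedding model, not OpenAI's API.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping windows of words.

    chunk_size and overlap are illustrative values, not recommendations.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query, chunks, k=2):
    """Return the k (score, chunk) pairs most similar to the query."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

The two knobs map onto the two problems: `chunk_size` and `overlap` decide whether related information stays together (A), and `k` decides how much possibly irrelevant text gets pulled into the prompt alongside the right hit (B).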