|
|
|
|
|
by yaj54
584 days ago
|
|
> 3. Hypothetical answer generation from a query using an LLM, and then using that hypothetical answer to query for embeddings works really well. I've been wondering about that and am glad to hear it's working in the wild. I'm now wondering if using a fine-tuned LLM (on the corpus) to gen the hypothetical answers and then use those for the rag flow would work even better. |
|
Interestingly, going both ways: generate hypothetical answers for the query, and also generate hypothetical questions for the text chunk at ingestion both increase RAG performance in my experience.
Though LLM-based query-processing is not always suitable for chat applications if inference time is a concer (like near-real time customer support RAG), so ingestion-time hypothetical answer generation is more apt there.
1. https://aclanthology.org/2023.acl-long.99/