by sroussey
917 days ago
Ask the LLM to summarize the question, then take an embedding of that summary. I think you can do the same with the data you store: summarize it to the same number of tokens, then generate an embedding of that summary and save it alongside the original text. Test! Different combinations of summarizing LLM and embedding model can give different results. But once you decide, you are locked into the summarizer as much as the embedding generator. Not sure if this is what the parent meant, though.
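A rough sketch of that flow, in case it helps. Both `summarize` and `embed` here are toy stand-ins I made up so the snippet runs on its own (truncation instead of an LLM summary, normalized word counts instead of a real embedding model); `TARGET_TOKENS`, `index_documents`, and `search` are hypothetical names too. The only point is that questions and stored text go through the same summarize-then-embed path.

```python
import math
from collections import Counter

TARGET_TOKENS = 32  # assumed shared summary length for questions and documents


def summarize(text: str, max_tokens: int = TARGET_TOKENS) -> str:
    """Toy stand-in for an LLM summarizer: keep the first max_tokens words."""
    return " ".join(text.split()[:max_tokens])


def embed(text: str) -> dict[str, float]:
    """Toy stand-in for an embedding model: L2-normalized word counts."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {token: c / norm for token, c in counts.items()}


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity of two already-normalized sparse vectors."""
    return sum(weight * b.get(token, 0.0) for token, weight in a.items())


def index_documents(docs: list[str]) -> list[tuple[str, dict[str, float]]]:
    """Store each original text next to the embedding of its summary."""
    return [(doc, embed(summarize(doc))) for doc in docs]


def search(question: str, index: list[tuple[str, dict[str, float]]]) -> str:
    """Summarize and embed the question the same way, return the closest text."""
    query_vec = embed(summarize(question))
    return max(index, key=lambda item: cosine(query_vec, item[1]))[0]
```

Swapping in a real summarizer or embedding model means replacing those two functions, and re-indexing everything, which is the lock-in mentioned above.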
Has anyone come across more recent experiments, results, or papers related to this? I'm acquainted with:
- Contriever 2021 paper: https://aclanthology.org/2021.eacl-main.74.pdf
- HyDE 2022: https://arxiv.org/pdf/2212.10496.pdf
My suspicion is that some pre-logic is needed, such as checking whether the user's question is dense enough and then using HyDE with chat history. If anyone has more recent experience with Contrievers, I would love to learn more about it!
Feel free to contact me directly on LinkedIn. https://www.linkedin.com/in/christybergman/
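One possible reading of that pre-logic, sketched below under loud assumptions: "dense enough" is approximated by counting non-stopword tokens, a dense question is used as-is, and a sparse one is expanded HyDE-style from the chat history plus the question. The threshold, the stopword list, and every function name here are hypothetical, and `generate_hypothetical_doc` is a caller-supplied stand-in for whatever LLM call produces the hypothetical document.

```python
import re
from typing import Callable

# Assumed, deliberately tiny stopword list; a real one would be much longer.
STOPWORDS = {
    "a", "an", "the", "is", "are", "was", "it", "this", "that", "what",
    "how", "why", "does", "do", "did", "about", "of", "to", "in", "on",
    "and", "or", "for", "with", "can", "you", "i", "me",
}


def content_word_count(question: str) -> int:
    """Crude density proxy (an assumption): number of non-stopword tokens."""
    tokens = re.findall(r"[a-z0-9']+", question.lower())
    return sum(1 for token in tokens if token not in STOPWORDS)


def build_retrieval_text(
    question: str,
    chat_history: list[str],
    generate_hypothetical_doc: Callable[[str], str],
    min_content_words: int = 4,  # hypothetical threshold, tune on your data
) -> str:
    """Return the text to embed for retrieval.

    Dense questions are embedded directly. Sparse ones are expanded by asking
    the supplied generator for a hypothetical answer document, with the chat
    history included so follow-ups like "what about it?" have context.
    """
    if content_word_count(question) >= min_content_words:
        return question
    prompt = "\n".join(chat_history + [question])
    return generate_hypothetical_doc(prompt)
```

With a fake generator such as `lambda prompt: "HYPO: " + prompt`, a follow-up like "what about it?" gets routed through the generator, while a question that already names its topic in several content words is returned unchanged.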