|
|
|
|
|
by halyax7
601 days ago
|
|
an issue I've seen in several RAG implementations is assuming that the target documents, however cleverly they're chunked, will be good search keys for incoming queries. Unless your incoming search text looks semantically like the documents you're searching over (not the case in general), you'll get bad hits. On a recent project, we saw a big improvement in retrieval relevance when we separated the search keys from the returned values (chunked documents), and we used an LM to generate appropriate keys which were then embedded. Appropriate in this case means "sentences like what the user might input if theyre expecting this chunk back" |
|