by vunderba
819 days ago
Make sure you're using a SOTA embedding model (UAE, text-embedding-ada-002, etc.) that can embed a reasonably long input (check the model's max token limit). See here for comparisons:
https://huggingface.co/spaces/mteb/leaderboard

Experiment with a "sliding scale" of chunk sizes across the book (paragraphs, pages, etc.). Try using a graph to relate book sections to each other. Consider setting up a tuner with well-defined questions and answers so you can search for the chunking and embedding setup that retrieves best.
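A rough sketch of what the chunk-size "tuner" could look like, in case it helps. Everything here is an assumption for illustration: the function names (`sliding_chunks`, `toy_embed`, `hit_rate`, `tune`) are made up, and `toy_embed` is just a bag-of-words stand-in so the example runs without any model; you would swap in calls to whatever real embedding model you pick. It scores each chunk size by how often the top-k retrieved chunks contain the expected answer string.

```python
import math
import re
from collections import Counter


def sliding_chunks(words, size, overlap):
    """Split a word list into windows of `size` words that share `overlap` words."""
    step = max(1, size - overlap)
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def toy_embed(text):
    """Stand-in embedding (bag-of-words counts). Replace with a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hit_rate(text, qa_pairs, size, overlap, k=1):
    """Fraction of questions whose top-k chunks contain the expected answer."""
    chunks = sliding_chunks(text.split(), size, overlap)
    vectors = [toy_embed(c) for c in chunks]
    hits = 0
    for question, answer in qa_pairs:
        q = toy_embed(question)
        ranked = sorted(
            range(len(chunks)),
            key=lambda i: cosine(q, vectors[i]),
            reverse=True,
        )
        if any(answer.lower() in chunks[i].lower() for i in ranked[:k]):
            hits += 1
    return hits / len(qa_pairs)


def tune(text, qa_pairs, sizes, overlap_ratio=0.25, k=1):
    """Score every candidate chunk size; returns {size: hit_rate}."""
    return {
        size: hit_rate(text, qa_pairs, size, int(size * overlap_ratio), k)
        for size in sizes
    }
```

Usage would be something like `tune(book_text, [("Who commands the ship?", "Ahab")], sizes=[50, 200, 800])`, then picking the size with the best score. Substring matching on the answer is a crude metric; with a real setup you would probably label which passage answers each question instead.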