HN Mirror


by blizzardman 819 days ago

Thank you for the recommendation. RAG would make sense. Here is my understanding of how to do it:

1. Use a sentence transformer to turn the entire Harry Potter or Lord of the Rings book into embeddings

2. Transform the query into an embedding -> "why didn't Gandalf send the eagles"

3. Find the most relevant text by running an ANN search with the query embedding

4. Pipe the context + query into Llama
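A minimal sketch of the four steps above, using only the standard library so it runs anywhere. Everything here is an assumption for illustration: `embed` is a toy bag-of-words stand-in for a real sentence-transformer, `top_k` does exact search where an ANN index would approximate it, the chunks are made-up placeholder sentences (not quotes from the book), and the final prompt is only built, not sent to Llama.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence-transformer: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Exact nearest-neighbour search; an ANN index would approximate this."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Step 4: combine retrieved context and the query into one prompt."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Step 1: placeholder chunks standing in for the embedded book.
chunks = [
    "The eagles carried Gandalf away from the tower of Orthanc.",
    "Frodo and Sam crossed the marshes with Gollum as their guide.",
    "The hobbits enjoyed a second breakfast in the Shire.",
]

# Steps 2-3: embed the query and retrieve the closest chunk.
query = "Why didn't Gandalf send the eagles?"
context = top_k(query, chunks, k=1)

# Step 4: this string is what would be passed to the LLM.
prompt = build_prompt(query, context)
print(prompt)
```

Printing the retrieved context before it reaches the model is a cheap way to tell whether a bad answer comes from retrieval or from generation.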

However, the results are not very good. Am I missing something in RAG?

1 comment

vunderba 818 days ago

Make sure you're using a SOTA embedding model (UAE, text-embedding-ada-002, etc.) that can embed a reasonably large number of tokens per input. See here for comparisons: https://huggingface.co/spaces/mteb/leaderboard

Experiment with a "sliding scale" of chunk sizes across the book (paragraphs, pages, etc.). Try using a graph to relate book sections to one another.
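One way to read the chunk-size suggestion is overlapping windows of different widths. The sketch below is an assumption about what that could look like, not a prescribed method: `sliding_chunks`, the word-based window, and the `(size, overlap)` pairs are all hypothetical choices, and real chunking would more likely follow paragraph or page boundaries.

```python
def sliding_chunks(words: list[str], size: int, overlap: int) -> list[str]:
    """Split a word list into windows of `size` words sharing `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once a window has reached the end of the text.
        if start + size >= len(words):
            break
    return chunks


text = "one two three four five six seven eight nine ten"
words = text.split()

# Try a couple of window sizes; each setting yields a different chunk set to embed.
for size, overlap in [(4, 2), (6, 3)]:
    print(size, overlap, sliding_chunks(words, size, overlap))
```

The overlap keeps a sentence that straddles a boundary fully present in at least one chunk, at the cost of embedding some text twice.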

Consider setting up a tuner with well-defined questions and answers to search for the best embedding settings.
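A rough sketch of such a tuner, under stated assumptions: the metric is a simple hit rate (does a top-k chunk contain the expected answer string?), the only setting being tuned is chunk size, `embed` is a toy bag-of-words stand-in for a real embedding model, and the corpus and question/answer pairs are made-up placeholders.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a real embedding model: bag-of-words counts."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def chunk(words: list[str], size: int) -> list[str]:
    """Non-overlapping chunks of `size` words."""
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def hit_rate(chunks: list[str], qa_pairs: list[tuple[str, str]], k: int) -> float:
    """Fraction of questions whose expected answer appears in a top-k chunk."""
    hits = 0
    for question, answer in qa_pairs:
        q = embed(question)
        ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
        if any(answer.lower() in c.lower() for c in ranked[:k]):
            hits += 1
    return hits / len(qa_pairs)


# Placeholder corpus and hand-written question/answer pairs.
corpus = (
    "The eagles rescued Gandalf from the tower. "
    "Sam cooked rabbit stew for Frodo. "
    "Gollum guided the hobbits through the marshes."
)
words = corpus.split()
qa_pairs = [
    ("Who did the eagles rescue?", "Gandalf"),
    ("What did Sam cook?", "stew"),
    ("Who guided the hobbits?", "Gollum"),
]

# Grid search over chunk sizes; keep the size with the best hit rate.
scores = {size: hit_rate(chunk(words, size), qa_pairs, k=1) for size in (4, 8, 16)}
best_size = max(scores, key=scores.get)
print(scores, best_size)
```

Hit rate alone favors very large chunks, since one chunk holding the whole book always contains the answer, so in practice it would need to be balanced against the model's context budget.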
