by hdlothia
750 days ago
So the part of RAG that's tripping me up right now is vector search and similarity scores. Does anyone have a good resource to learn more about this? I've been using this as a starter: https://developers.cloudflare.com/workers-ai/tutorials/build... I put in text, but my conception of what should get high relevancy scores doesn't match the percentages that come out. The article talks about full-text search and metadata, so maybe that's the path I should be taking instead of vector search? Where would I store the metadata in this case? A regular DB? I wish articles like this would go into more detail about the nitty-gritty, but I appreciate the high-level overview in the article as well.

A good overview is chapter 6 of the Stanford NLP group's IR book [0].
Engineering with LLMs still requires a good foundation in the basics of ML/NLP, so it's worth the time to catch up a bit.
[0] https://web.archive.org/web/20231207074155/https://nlp.stanf...
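
To make the scoring question concrete, here is a minimal sketch of the classic vector space model: tf-idf weights plus cosine similarity, in plain Python. It is not what the Cloudflare tutorial does (that uses learned embeddings), and the function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank`) and the whitespace tokenizer are my own assumptions for illustration. The takeaway is that a score is the cosine of an angle between two vectors, not a probability of relevance, so the raw numbers rarely line up with intuition about "percent relevant".

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenizer, for illustration only.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return sparse tf-idf vectors (dicts) for docs, plus the idf table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # A term that appears in every document gets idf = log(1) = 0.
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(toks).items()}
        for toks in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Score every document against the query, best match first."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms never seen in the corpus get weight 0.
    query_vec = {
        term: tf * idf.get(term, 0.0)
        for term, tf in Counter(tokenize(query)).items()
    }
    scored = [(cosine(query_vec, vec), doc) for vec, doc in zip(vectors, docs)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    corpus = [
        "the cat sat on the mat",
        "dogs chase the cat",
        "stock prices fell sharply",
    ]
    for score, doc in rank("cat on mat", corpus):
        print(f"{score:.3f}  {doc}")
```

Only the ordering of the scores carries meaning here. With embedding models the same caveat applies, and the scores often sit in a narrow band, which is one reason people combine vector search with full-text search or metadata filters.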