by brianykim
80 days ago
Good company-ready RAG benefits a lot from some basic pre-processing and labeling of the data, instead of just dumping unstructured data into a vector database and calling it a day. Different heuristics and different schemas for the embedded data go a long way toward ensuring quality and flexibility in querying. Then you can do ReAG, which lets you reason intelligently over the top K results. Things like memory knowledge graph services can also help reduce your search space and provide extra context that gets updated over time, beyond just treating static docs as sources of truth. You can give the system more context on how it should interpret older docs vs. newer docs, and let users flag what is correct or not so they can help audit what is embedded in your RAG system. I appreciate the thorough write-up, but doing RAG systems seriously requires much more than just embeddings and a basic chromadb setup. Happy to share any thoughts here or on a call if anyone wants to chat.
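To make the pre-processing and labeling point a bit more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the `Chunk` schema, the `doc_type` label, the `retrieve` helper, the toy 2-d embeddings, and the half-life recency weighting are all hypothetical, not anything from the write-up or from a particular library. In a real setup the label filter would probably be a metadata filter in your vector store rather than a list comprehension.

```python
from __future__ import annotations

import math
from dataclasses import dataclass
from datetime import date


@dataclass
class Chunk:
    """Hypothetical schema: an embedded chunk plus labels added at ingest time."""
    text: str
    embedding: list[float]
    doc_type: str   # label assigned during pre-processing, e.g. "policy"
    updated: date   # lets retrieval treat older and newer docs differently


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_emb: list[float], chunks: list[Chunk], doc_type: str,
             today: date, k: int = 2, half_life_days: float = 365.0) -> list[Chunk]:
    """Filter by label first (smaller search space), then rank by
    similarity scaled by a recency weight (assumed: halves every half_life_days)."""
    candidates = [c for c in chunks if c.doc_type == doc_type]

    def score(c: Chunk) -> float:
        age_days = max((today - c.updated).days, 0)
        recency = 0.5 ** (age_days / half_life_days)
        return cosine(query_emb, c.embedding) * recency

    return sorted(candidates, key=score, reverse=True)[:k]


# Toy data: two policy docs of different ages, plus an unrelated note.
chunks = [
    Chunk("old policy", [1.0, 0.0], "policy", date(2020, 1, 1)),
    Chunk("new policy", [0.9, 0.1], "policy", date(2024, 1, 1)),
    Chunk("meeting notes", [1.0, 0.0], "notes", date(2024, 1, 1)),
]
top = retrieve([1.0, 0.0], chunks, doc_type="policy", today=date(2024, 6, 1), k=1)
```

The older policy doc is the closer raw match here, but the recency weight pushes the newer one ahead, and the notes never enter the candidate set at all. The top K coming out of something like this is what you would then hand to a reasoning step.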