by underlines 635 days ago
We're building a corporate RAG system for a government entity. Here is what I've learned so far from applying an experimental A/B testing approach to RAG using RAGAS metrics:

- Hybrid Retrieval (semantic + vector) followed by LLM-based reranking made no significant difference when measured on synthetic eval-questions

- HyDE severely decreased both answer quality and retrieval quality when measured with RAGAS using synthetic eval-questions

(we still have to do a RAGAS eval using expert and real user questions)

So yes, hybrid retrieval is always good - that's no news to anyone building production-ready or enterprise RAG solutions. But no single method always wins. We found the semantic search in Azure AI Search sufficient as a second method alongside vector similarity. Others might find BM25 great, or a fine-tuned SLM for query post-processing. It depends on the use case. Test, test, test.
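For anyone wondering what combining two retrievers can look like in practice: one common option is reciprocal rank fusion (RRF). This is only a minimal sketch; the comment above does not say which fusion method was used, and the document ids and the constant `k=60` are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a vector search and a keyword/semantic search.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that rank well in both lists float to the top, which is the property you would then A/B test against a single-retriever baseline.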

Next things we're going to try:

- RAPTOR

- SelfRAG

- Agentic RAG

- Query Refinement (expansion and sub-queries)

- GraphRAG

Learning so far:

- Always use a baseline and an experiment to try to refute your null hypothesis using measures like RAGAS or others.

- Use three types of evaluation questions/answers: 1. Expert written q&a, 2. Real user questions (from logs), 3. Synthetic q&a generated from your source documents
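The baseline-vs-experiment comparison can be sketched as a paired permutation (sign-flip) test over per-question scores. This is an assumption about how one might test the null hypothesis, not the exact procedure used here; the score lists are made up, and in practice they would be a per-question metric (e.g. from RAGAS) for the same questions under both pipelines.

```python
import random


def paired_permutation_test(baseline, experiment, n_resamples=10_000, seed=0):
    """Two-sided p-value for 'the mean per-question difference is zero'.

    Under the null hypothesis the sign of each paired difference is arbitrary,
    so we flip signs at random and count how often the resampled mean is at
    least as extreme as the observed one.
    """
    if len(baseline) != len(experiment) or not baseline:
        raise ValueError("need two non-empty score lists of equal length")
    diffs = [e - b for b, e in zip(baseline, experiment)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped / n) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical per-question scores for the same 12 eval questions.
baseline_scores = [0.60, 0.55, 0.70, 0.65, 0.50, 0.62, 0.58, 0.66, 0.61, 0.59, 0.63, 0.57]
experiment_scores = [s + 0.2 for s in baseline_scores]

p_changed = paired_permutation_test(baseline_scores, experiment_scores)
p_same = paired_permutation_test(baseline_scores, baseline_scores)
print(p_changed, p_same)
```

Running the same test separately on expert, real-user, and synthetic question sets keeps a win on one set from hiding a loss on another.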


Could you explain or link to explanations of all of the acronyms you’ve used in your comment?
These are all "techniques" on top of the foundations of RAG. It's similar to "Chain of Thought" in prompt engineering. You have an underlying technology, and then come up with techniques/frameworks on top. It's what MVC was for web dev 15+ years ago.

RAPTOR, for example, is a technique that clusters documents together, summarizes each cluster, and embeds the summaries, repeating the process so that the results form a sort of tree. Paper: https://arxiv.org/html/2401.18059v1
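A very rough sketch of the tree-building idea, with loud caveats: fixed-size grouping of neighbours stands in for the paper's embedding-based clustering, and `summarize` is a hypothetical callable you would back with an LLM. Nothing here is the paper's actual implementation.

```python
def build_summary_tree(chunks, summarize, group_size=2):
    """Return a list of levels: level 0 holds the raw chunks, the last level the root.

    Each higher level is produced by grouping the nodes below it and
    summarizing every group into one parent node.
    """
    if not chunks:
        raise ValueError("need at least one chunk")
    if group_size < 2:
        raise ValueError("group_size must be at least 2 so the tree shrinks")
    levels = [list(chunks)]
    while len(levels[-1]) > 1:
        current = levels[-1]
        groups = [current[i:i + group_size] for i in range(0, len(current), group_size)]
        levels.append([summarize(group) for group in groups])
    return levels


# Toy stand-in for an LLM summarizer: just joins the texts.
def toy_summarize(texts):
    return " + ".join(texts)


levels = build_summary_tree(["a", "b", "c"], toy_summarize)
print(levels)
```

Every node at every level would then be embedded and indexed, so a query can match either a specific chunk or a higher-level summary.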

Agentic RAG means creating an agent that can decide to augment "conversations" (or other LLM tools) with RAG searches and assess the relevance of the results. Pretty useful, but hard to implement right.

You can google the others, they're all more or less these "techniques" to improve an old-fashioned RAG search.

Worth noting that a lot of the gains you get from RAPTOR are (in my use cases) related to giving context to the chunks. Simpler, easier-to-implement methods of summarizing context (e.g. in a hierarchical document) and cutting chunks along document boundaries can get you most of the way there with less effort (though again, as others mentioned, it depends on your use case).
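A minimal sketch of what I mean, assuming the source documents use Markdown-style `#` headings (that format, and the function name, are assumptions for illustration): cut chunks at section boundaries and prefix each chunk with its heading path.

```python
def chunk_with_heading_context(text):
    """Split text at '#' headings and prefix each chunk with its heading path."""
    chunks = []
    path = []  # current heading stack, e.g. ["Handbook", "Leave"]
    body = []  # lines collected for the section being read

    def flush():
        content = " ".join(body).strip()
        if content:
            prefix = " > ".join(path)
            chunks.append(f"{prefix}\n{content}" if prefix else content)
        body.clear()

    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            flush()  # a new heading closes the previous section
            level = len(stripped) - len(stripped.lstrip("#"))
            title = stripped.lstrip("#").strip()
            del path[level - 1:]  # drop headings at this depth or deeper
            path.append(title)
        elif stripped:
            body.append(stripped)
    flush()
    return chunks


doc = (
    "# Handbook\nIntro text.\n"
    "## Leave\nStaff get 25 days.\n"
    "## Travel\nBook via the portal."
)
chunks = chunk_with_heading_context(doc)
for chunk in chunks:
    print(chunk, end="\n\n")
```

Each embedded chunk then carries its position in the document, which is a cheap approximation of the context a summary tree would provide.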
It makes me chuckle a bit to see this kind of request in a tech forum, particularly when discussing advanced LLM-related topics.

This is akin to a HN comment asking someone to search the Internet for something on their behalf, while discussing search engine algorithms!

A lot of people here (myself included) work across different specialisations and are here to learn from discussion that is intentionally unfamiliar.
Yes, but ChatGPT knows these things! Just ask it to expand the acronyms.

This is the new “can you Google that for me?”

ChatGPT can also translate from Arabic to English, but it would be annoying to use it for conversation in this context.
I'm so lazy that I don't want to press a few buttons, so please write me a page of text.
Another solution is to downvote / not upvote comments which place an unreasonable burden on the reader. The best comments are those which can be broadly understood without a need for Googling acronyms or "expanding" the comment using an LLM.
It adds useful context to the discussion and spurs further conversation.
HyDE: Hypothetical Document Embeddings [1]

RAGAS: RAG Assessment [2]

RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval [3]

Self-RAG: Self-Reflective Retrieval-Augmented Generation [4]

Agentic RAG: Agentic Retrieval-Augmented Generation [5]

GraphRAG: Graph Retrieval-Augmented Generation [6]

[1] https://docs.haystack.deepset.ai/docs/hypothetical-document-...

[2] https://docs.ragas.io/en/stable/

[3] https://arxiv.org/html/2401.18059v1

[4] https://selfrag.github.io

[5] https://langchain-ai.github.io/langgraph/tutorials/rag/langg...

[6] https://www.microsoft.com/en-us/research/blog/graphrag-unloc...

What do you think of HippoRAG? Have you tried it, or do you plan to?