by ajhai 1048 days ago

Kudos to the team for a very detailed notebook that goes into things like evaluating the pipeline on performance and cost. Even if we ignore the framework-specific bits, it is a great guide to follow when building RAG systems in production.

We have been building RAG systems in production for a few months and have been tinkering with different strategies to get the most performance out of these pipelines. As others have pointed out, a vector database may not be the right strategy for every problem. Similarly, there are issues like the "lost in the middle" problem (https://arxiv.org/abs/2307.03172) that one may have to deal with. We put together what we learned building and optimizing these pipelines in a post at https://llmstack.ai/blog/retrieval-augmented-generation.
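To make the "lost in the middle" point concrete: the linked paper reports that models tend to use information best when it sits near the start or end of the context. One common mitigation (not necessarily what we or the notebook do) is to reorder retrieved chunks so the strongest hits land at the edges of the prompt. A minimal sketch, assuming the retriever already returns chunks sorted most relevant first; the function name is made up for illustration:

```python
def reorder_for_long_context(docs_by_relevance):
    """Put the most relevant docs at the edges and the weakest in the middle.

    Assumes `docs_by_relevance` is sorted most relevant first.
    Hypothetical helper, not part of any library.
    """
    front, back = [], []
    for i, doc in enumerate(docs_by_relevance):
        # Alternate: even ranks fill the front, odd ranks fill the back.
        if i % 2 == 0:
            front.append(doc)
        else:
            back.append(doc)
    # Reverse the back half so the 2nd most relevant doc ends up last.
    return front + back[::-1]


ranked = ["doc1", "doc2", "doc3", "doc4", "doc5"]  # doc1 = most relevant
print(reorder_for_long_context(ranked))
```

With five ranked chunks, the top hit stays first, the second-best moves to the end, and the weakest one sits in the middle. Whether this actually helps depends on the model and context length, so it is worth measuring in your own evaluation.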

https://github.com/trypromptly/LLMStack is a low-code platform we open-sourced recently. It ships these RAG pipelines out of the box, along with some app templates, if anyone wants to try them out.