by ajainvivek
121 days ago
Hi HN, I built ReasonDB (open source) after spending 3 years building a knowledge intelligence layer at Brainfish (my company). We stitched together vector DBs, graph DBs, and custom RAG pipelines. The constant problem: when search returns the wrong documents, your AI confidently gives wrong answers. Debugging why embeddings didn't surface the right chunk is a black box: we'd fix one case and break three others.

ReasonDB takes a different approach. Instead of shredding documents into flat chunks and hoping cosine similarity finds the answer, it preserves document hierarchy as a tree and lets the LLM navigate through it, like a human scanning a table of contents and drilling into the right section.

How it works:
- Documents are ingested as hierarchical trees (headings, sections, subsections) with LLM-generated summaries at each node
- When you query, a 4-phase pipeline kicks in: BM25 narrows candidates → tree-grep filters by structure → LLM ranks by summary → parallel beam-search traversal extracts answers
- The LLM visits ~25 nodes out of millions instead of searching a flat vector space

It also has RQL, a SQL-like query language with SEARCH (BM25) and REASON (LLM) clauses:

SELECT * FROM contracts REASON 'What are the termination conditions?'
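
To make the tree navigation idea concrete, here is a minimal Python sketch. It is not ReasonDB's actual code: `Node`, `score`, and `beam_search` are hypothetical names made up for illustration, and a simple word-overlap count stands in for the LLM call that would rank nodes by their summaries. It only shows the general shape of a beam search over a document tree, where each level keeps the best few branches and ignores the rest.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One section of a document (hypothetical structure, for illustration)."""
    title: str
    summary: str  # assumed to be LLM-generated at ingest time
    text: str = ""
    children: list[Node] = field(default_factory=list)


def score(query: str, node: Node) -> int:
    """Stand-in for an LLM relevance call: count words shared with the query."""
    query_words = set(query.lower().split())
    node_words = set(f"{node.title} {node.summary}".lower().split())
    return len(query_words & node_words)


def beam_search(root: Node, query: str, beam_width: int = 2):
    """Descend level by level, keeping only the best `beam_width` nodes.

    Returns (leaf nodes reached, number of nodes that were scored).
    """
    frontier, leaves, visited = [root], [], 0
    while frontier:
        candidates = []
        for node in frontier:
            if node.children:
                candidates.extend(node.children)
            else:
                leaves.append(node)  # nothing left to expand: candidate answer
        visited += len(candidates)
        candidates.sort(key=lambda n: score(query, n), reverse=True)
        frontier = candidates[:beam_width]
    return leaves, visited


# A toy contract with made-up sections.
root = Node("Contract", "master services agreement", children=[
    Node("Payment", "fees invoices and payment schedule", children=[
        Node("Invoices", "invoices are due in 30 days"),
        Node("Late fees", "late payment fees"),
    ]),
    Node("Termination", "termination conditions and notice", children=[
        Node("For cause", "termination conditions for material breach"),
        Node("Convenience", "termination for convenience with notice"),
    ]),
    Node("Confidentiality", "handling of confidential information", children=[
        Node("Scope", "what counts as confidential"),
    ]),
])

leaves, visited = beam_search(root, "termination conditions", beam_width=1)
print([leaf.title for leaf in leaves], visited)
```

With `beam_width=1` the search scores the three top-level sections, descends only into "Termination", and never looks at the children of "Payment" or "Confidentiality". That pruning is the property the post describes: the number of nodes examined grows with tree depth and beam width rather than with corpus size.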

Built in Rust (redb, tantivy, axum, tokio). Single binary. ACID-compliant. Supports OpenAI, Anthropic, Gemini, Cohere, and open source models. Runs with one Docker command.

GitHub: https://github.com/reasondb/reasondb

Would love feedback and criticism, especially from anyone who's fought the same RAG quality battles.