by ajainvivek 102 days ago
Benchmark

We ran a small benchmark on a real-world insurance corpus:
• 4 policy documents
• ~1,900 hierarchical nodes
• 100 queries across 6 complexity tiers

Comparing ReasonDB to a typical RAG pipeline (LangChain / LlamaIndex defaults):

Metric            ReasonDB        Typical RAG
Pass rate         100% (12/12)    55–70%
Context recall    90% avg         60–75%
Median latency    6.1 s           15–45 s

The key difference is that ReasonDB performs BM25 candidate selection + LLM-guided traversal of the document hierarchy, rather than flat chunk-similarity retrieval.
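To make that two-stage idea concrete, here is a rough sketch of how BM25 candidate selection followed by a guided descent could look. This is not ReasonDB's code or API: `Node`, `demo_tree`, `keyword_judge`, `retrieve`, and the placeholder clause text are all hypothetical, and the LLM is replaced by a simple keyword-overlap judge so the example runs offline.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical hierarchical node (document -> section -> clause)."""
    node_id: str
    text: str
    children: list = field(default_factory=list)


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def subtree_text(node):
    """Concatenate a node's text with the text of all its descendants."""
    return " ".join([node.text] + [subtree_text(c) for c in node.children])


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Plain Okapi BM25 score of each document against the query."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n if n else 0.0
    df = Counter(term for d in tokenized for term in set(d))
    query_terms = tokenize(query)
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def keyword_judge(query, node):
    """Stand-in for the LLM: 'relevant' if the subtree shares a query term."""
    return bool(set(tokenize(query)) & set(tokenize(subtree_text(node))))


def guided_traversal(query, node, is_relevant):
    """Descend only into children the judge accepts; return the leaves reached."""
    if not node.children:
        return [node]
    hits = []
    for child in node.children:
        if is_relevant(query, child):
            hits.extend(guided_traversal(query, child, is_relevant))
    return hits


def retrieve(query, root, is_relevant, top_k=2):
    """Stage 1: BM25 over top-level sections. Stage 2: guided descent."""
    sections = root.children
    scores = bm25_scores(query, [subtree_text(s) for s in sections])
    ranked = sorted(zip(scores, sections), key=lambda p: p[0], reverse=True)
    candidates = [s for score, s in ranked[:top_k] if score > 0]
    hits = []
    for section in candidates:
        hits.extend(guided_traversal(query, section, is_relevant))
    return hits


def demo_tree():
    """Made-up placeholder text, not taken from any real policy."""
    return Node("policy", "Example policy", [
        Node("definitions", "Definitions", [
            Node("def-disability", "Placeholder clause about recurrent disability"),
            Node("def-premium", "Placeholder clause about the premium payable"),
        ]),
        Node("claims", "Claims", [
            Node("claims-notice", "Placeholder clause about written notice"),
        ]),
    ])


if __name__ == "__main__":
    found = retrieve("What conditions define recurrent disability?",
                     demo_tree(), keyword_judge)
    print([n.node_id for n in found])
```

In a real system `keyword_judge` would presumably be swapped for an LLM call that sees the query plus a summary of each child node. The point of the sketch is only that BM25 narrows the search to a few subtrees, so the expensive judge runs on a handful of nodes instead of the whole corpus.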

Example reasoning case

One query asked:

“What conditions define recurrent disability?”

The answer was split across two sections:
• disability definition clause
• policy schedule clause

Flat chunk retrieval returned only the first section.

ReasonDB followed the cross-reference extracted during ingestion and retrieved the second section as well, which raised recall from 67% to 100%.
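A minimal sketch of that cross-reference step, assuming ingestion stores a list of outgoing references per section. The `SECTIONS` layout, the section ids, and `expand_with_cross_references` are hypothetical illustrations, not ReasonDB's actual schema, and the example does not try to reproduce the recall figures above.

```python
from collections import deque

# Hypothetical ingestion output: each section lists the sections it refers to.
SECTIONS = {
    "disability-definition": {"refs": ["policy-schedule"]},
    "policy-schedule": {"refs": []},
    "premium-clause": {"refs": []},
}


def expand_with_cross_references(initial_hits, sections, max_hops=1):
    """Breadth-first expansion of retrieved sections along stored references.

    Returns section ids in discovery order, following at most `max_hops`
    reference links away from the initial hits.
    """
    ordered = []
    seen = set()
    queue = deque((sid, 0) for sid in initial_hits)
    while queue:
        sid, hops = queue.popleft()
        if sid in seen or sid not in sections:
            continue
        seen.add(sid)
        ordered.append(sid)
        if hops < max_hops:
            for ref in sections[sid]["refs"]:
                queue.append((ref, hops + 1))
    return ordered


if __name__ == "__main__":
    flat = ["disability-definition"]  # what flat retrieval returned
    print(expand_with_cross_references(flat, SECTIONS))
```

The hop limit is a design choice in this sketch: it keeps one strong hit from dragging in the whole document through chains of references.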