We built BMP, a fast and memory-efficient search engine for learned sparse retrieval, written in Rust with Python bindings. It supports exhaustive (non-approximate) search over large collections like MS MARCO, without dropping query terms or pruning the index.

Features:

- Full support for SPLADE, uniCOIL, CSV, and similar models
- No static pruning – keeps full index fidelity
- No term dropping – every token counts
- Runs fast thanks to block-max pruning
- Usable from Python
- Pre-built indexes available from CIFF-Hub: https://github.com/pisa-engine/ciff-hub/

Backed by the paper: Faster Learned Sparse Retrieval with Block-Max Pruning (SIGIR 2024) - https://arxiv.org/pdf/2405.01117

Would love feedback, issues, or contributions!
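
For anyone wondering how pruning at search time can still give exact results, here is a toy Python sketch of the general block-max idea. It is not BMP's code or its Python API: the function names (`build_block_max`, `search`), the tiny collection, and the block size are all made up for illustration, and it assumes non-negative term weights. Documents are grouped into fixed-size blocks, each block stores the maximum weight per term, and a query only scores blocks whose upper bound could still beat the current k-th best score.

```python
import heapq
from collections import defaultdict


def build_block_max(docs, block_size):
    """Per-term maximum weight inside each block of consecutive doc ids.

    docs: list of {term: weight} sparse vectors; the doc id is the list index.
    """
    n_blocks = (len(docs) + block_size - 1) // block_size
    block_max = defaultdict(lambda: [0.0] * n_blocks)
    for doc_id, doc in enumerate(docs):
        block = doc_id // block_size
        for term, weight in doc.items():
            block_max[term][block] = max(block_max[term][block], weight)
    return block_max, n_blocks


def score(query, doc):
    """Dot product between a sparse query and a sparse document."""
    return sum(q_weight * doc.get(term, 0.0) for term, q_weight in query.items())


def search(docs, block_max, n_blocks, block_size, query, k):
    """Exact top-k; returns (results, number_of_blocks_scored)."""
    # Upper bound on the score of any document in each block.
    bounds = [0.0] * n_blocks
    for term, q_weight in query.items():
        if term in block_max:
            for block, max_weight in enumerate(block_max[term]):
                bounds[block] += q_weight * max_weight

    # Visit the most promising blocks first.
    order = sorted(range(n_blocks), key=lambda block: -bounds[block])

    heap = []  # min-heap of (score, doc_id) holding the current top-k
    blocks_scored = 0
    for block in order:
        # No remaining block can beat the k-th best score: safe to stop.
        if len(heap) == k and bounds[block] <= heap[0][0]:
            break
        blocks_scored += 1
        start = block * block_size
        end = min(start + block_size, len(docs))
        for doc_id in range(start, end):
            s = score(query, docs[doc_id])
            if len(heap) < k:
                heapq.heappush(heap, (s, doc_id))
            elif s > heap[0][0]:
                heapq.heapreplace(heap, (s, doc_id))
    return sorted(heap, key=lambda item: (-item[0], item[1])), blocks_scored


# Hypothetical toy collection: 8 documents, blocks of 2.
docs = [
    {"a": 1.0, "b": 0.4}, {"a": 0.2},
    {"b": 0.1}, {"c": 0.3},
    {"a": 2.0, "c": 1.0}, {"b": 1.5},
    {"c": 0.1}, {"c": 0.2},
]
query = {"a": 1.0, "b": 1.0}
block_max, n_blocks = build_block_max(docs, block_size=2)
results, blocks_scored = search(docs, block_max, n_blocks, 2, query, k=2)
print(results, blocks_scored)
```

In this toy run only one of the four blocks gets scored, yet the top-2 scores match a brute-force scan, because a block is skipped only when its upper bound cannot exceed the current threshold. The real engine differs in many details, so treat this purely as intuition.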