by j_kao 309 days ago
Author here! We were really motivated to turn a "distributed system" problem into a "monolithic system" problem from an operations perspective, and felt this was achievable with current hardware. That's why we went with in-process, embedded storage systems like RocksDB and Tantivy.

Memory-mapping lets us get pretty far, even with global coverage. We are always able to add more RAM, especially since we're running in the cloud.
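
For readers unfamiliar with the memory-mapping idea, here is a minimal sketch using Python's standard `mmap` module. It is not the code from the project discussed here; the fixed-width record layout and the names `write_records`, `read_record`, and `demo` are assumptions made for illustration. The point is only that a mapped file can be read at arbitrary offsets while the OS decides which pages actually sit in RAM.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record layout: one little-endian unsigned 64-bit integer.
RECORD = struct.Struct("<Q")


def write_records(path, values):
    """Write fixed-width records to a flat file."""
    with open(path, "wb") as f:
        for v in values:
            f.write(RECORD.pack(v))


def read_record(mapped, index):
    """Read one record by position; only the touched pages get paged in."""
    return RECORD.unpack_from(mapped, index * RECORD.size)[0]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "records.bin")
        write_records(path, range(1000))
        with open(path, "rb") as f:
            # Length 0 maps the whole file; ACCESS_READ keeps it read-only.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
                return read_record(mapped, 0), read_record(mapped, 999), len(mapped)
```

Calling `demo()` reads the first and last record without loading the whole file through explicit `read()` calls.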

Backfills and data updates are also trivial and can be performed in an "immutable" way, without having to reason about what's currently in ES/Mongo. We just re-index everything with the same binary on a separate node and ship the final assets to S3.
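
A rough sketch of what an "immutable" rebuild-and-publish flow could look like, under stated assumptions: a local directory stands in for S3, the index is a plain JSON file rather than RocksDB or Tantivy assets, and the names `build_snapshot`, `publish`, `load_current`, and the `CURRENT` pointer file are hypothetical. It is not the pipeline described above, only an illustration of building a full snapshot off to the side and switching readers to it in one step.

```python
import json
import os
import tempfile


def build_snapshot(root, version, documents):
    """Build a complete index in a fresh directory; existing snapshots are never modified."""
    snapshot_dir = os.path.join(root, f"snapshot-{version}")
    os.makedirs(snapshot_dir)  # raises if this version was already built
    with open(os.path.join(snapshot_dir, "index.json"), "w") as f:
        json.dump(documents, f)
    return snapshot_dir


def publish(root, snapshot_dir):
    """Repoint CURRENT at the finished snapshot with a single atomic rename."""
    pointer = os.path.join(root, "CURRENT")
    tmp_pointer = pointer + ".tmp"
    with open(tmp_pointer, "w") as f:
        f.write(os.path.basename(snapshot_dir))
    os.replace(tmp_pointer, pointer)


def load_current(root):
    """Read whichever snapshot CURRENT names."""
    with open(os.path.join(root, "CURRENT")) as f:
        name = f.read().strip()
    with open(os.path.join(root, name, "index.json")) as f:
        return json.load(f)


def demo():
    with tempfile.TemporaryDirectory() as root:
        publish(root, build_snapshot(root, 1, {"a": 1}))
        first = load_current(root)
        publish(root, build_snapshot(root, 2, {"a": 1, "b": 2}))
        second = load_current(root)
        old_kept = os.path.isdir(os.path.join(root, "snapshot-1"))
        return first, second, old_kept
```

Because a rebuild never edits a published snapshot, rolling back in this sketch is just pointing `CURRENT` at an older directory.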

1 comment

Why not just use an open-source solution like ParadeDB ...?

ParadeDB = Postgres with the pg_search plugin (the base is Tantivy). If you need anything else, like vectors or whatever, get the corresponding plugins for Postgres.

The only thing you're missing is an LSM solution like RocksDB. See OrioleDB, which is supposed to become a pluggable storage engine for Postgres but is not yet out of beta.

Feels like people reinvent the wheel very often.

What was your experience like putting such a thing together?