by whilo 97 days ago
Hey. Hybrid in which sense?

I have integrated Stratum's columnar indices as a secondary index in the new query engine of https://github.com/replikativ/datahike itself, so for numerical data you will be able to use Datalog/SQL for combined (OLTP, OLAP, ...) processing. The same goes for proximum (persistent HNSW vector index) and scriptum (persistent Lucene).

As far as I have tested, Stratum can already be updated online via copy-on-write, with better write throughput than purely columnar alternatives (Stratum uses a persistent B-tree over column chunks). I have not compared it in benchmarks yet, though; DuckDB, for instance, recommends against online updates. It depends on the workload: if you do random-access writes, the columnar layout overhead will still be a slowdown compared to OLTP/Datahike's row/entity-wise indices. Storing fully variable strings in a column is also inefficient; for those you want the entity-wise indices.
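
To make the copy-on-write idea concrete, here is a rough Python sketch. It is not Stratum's code or API: `ChunkedColumn`, `assoc` and `CHUNK_SIZE` are hypothetical names, and a flat tuple of chunks stands in for the persistent B-tree. The point is only that a write copies one chunk and shares the rest, so the old version stays readable.

```python
from typing import Iterable, Tuple

CHUNK_SIZE = 4  # hypothetical; real column chunks would be far larger

Chunk = Tuple[float, ...]


class ChunkedColumn:
    """Immutable numeric column stored as fixed-size immutable chunks."""

    def __init__(self, chunks: Tuple[Chunk, ...]):
        self.chunks = chunks

    @classmethod
    def from_values(cls, values: Iterable[float]) -> "ChunkedColumn":
        vals = list(values)
        chunks = tuple(
            tuple(vals[i:i + CHUNK_SIZE]) for i in range(0, len(vals), CHUNK_SIZE)
        )
        return cls(chunks)

    def get(self, index: int) -> float:
        chunk_idx, offset = divmod(index, CHUNK_SIZE)
        return self.chunks[chunk_idx][offset]

    def assoc(self, index: int, value: float) -> "ChunkedColumn":
        """Return a new column version; only the touched chunk is copied."""
        chunk_idx, offset = divmod(index, CHUNK_SIZE)
        old = self.chunks[chunk_idx]
        new_chunk = old[:offset] + (value,) + old[offset + 1:]
        # Untouched chunks are shared by reference between both versions.
        return ChunkedColumn(
            self.chunks[:chunk_idx] + (new_chunk,) + self.chunks[chunk_idx + 1:]
        )

    def total(self) -> float:
        # Analytical scan: walks each chunk sequentially.
        return sum(sum(chunk) for chunk in self.chunks)


v1 = ChunkedColumn.from_values(range(10))
v2 = v1.assoc(5, 100.0)  # v1 is still valid and unchanged
```

In this simplification each write still copies the outer tuple of chunk references, which grows with the number of chunks; a tree over the chunks would only copy the path from the root to the touched chunk. A random-access write still rewrites a whole chunk either way, which is the overhead mentioned above.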