
For our scale and request patterns (easily partitioned / 0.1 qps), I've had no major issues, but the JavaScript bindings I use (which are different to their wasm bindings) leave a lot to be desired. To DuckDB's credit, they seem to have top-notch C++ and Python bindings that even support the efficient memory-mapped Arrow format, which is purpose-built for cross-language / cross-process data sharing, in addition to being a top-notch in-memory representation for pandas-like data-frames.

Granted, DuckDB is in constant development, but it doesn't yet have a native cross-version export/import feature (its developers say DuckDB hasn't matured enough to stabilise its on-disk format just yet).

I also keep an eye on https://h2oai.github.io/db-benchmark/. As for Arrow-backed query engines, Pola.rs and DataFusion in particular sound the most exciting to me.

It also remains to be seen how Databricks' delta.io develops (it might come in handy for much, much larger data warehouses).