Gorilla: A fast, scalable, in-memory time series database (2016) (blog.acolyer.org)
25 points by xnorswap 36 days ago
2 comments

The simplicity of Gorilla is attractive, but for better compression ratios without too much extra compute I'd instead recommend Sprintz: https://github.com/dblalock/sprintz.

The downsides are that (a) Sprintz requires the data to be quantised to fixed-point integers, which is usually fine if the data is coming out of a sensor of some sort, and (b) the Huffman coding step of Sprintz requires dynamic memory allocation, whilst Gorilla is almost trivially implemented without it.
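To make point (a) concrete, here is a minimal sketch of quantising float readings to fixed-point integers before handing them to an integer compressor. The function names, the scale factor of 100 and the 16-bit width are assumptions chosen for illustration; none of this is taken from the Sprintz code.

```python
def quantise(values, scale=100, bits=16):
    """Map floats to signed fixed-point integers.

    `scale` and `bits` are illustrative assumptions: scale=100 keeps two
    decimal places, bits=16 clamps the result to a signed 16-bit range.
    """
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    # round() to the nearest integer, then clamp to the representable range
    return [max(lo, min(hi, round(v * scale))) for v in values]


def dequantise(ints, scale=100):
    """Invert quantise(), up to the rounding error of 1 / (2 * scale)."""
    return [i / scale for i in ints]
```

For sensor data the error is bounded by half a quantisation step, which is often already below the sensor's own noise floor. Values outside the clamped range are silently saturated in this sketch.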

Also see Chimp, which proposes some small tweaks to Gorilla to improve its performance: https://dl.acm.org/doi/abs/10.14778/3551793.3551852
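For anyone who hasn't looked at the value encoding that Chimp tweaks: Gorilla XORs each double with the previous one and only stores the "meaningful" bits between the leading and trailing zeros. The sketch below shows just that XOR and zero-counting step, not the bit-level output format; the helper names are made up for illustration.

```python
import struct


def float_bits(x):
    """Reinterpret a Python float as its 64-bit IEEE 754 bit pattern."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_windows(values):
    """For each value after the first, XOR it with its predecessor.

    Yields None when the value is identical to the previous one (the cheap
    case), otherwise a tuple (leading_zeros, trailing_zeros, meaningful_bits).
    """
    out = []
    prev = float_bits(values[0])
    for v in values[1:]:
        cur = float_bits(v)
        x = prev ^ cur
        if x == 0:
            out.append(None)
        else:
            leading = 64 - x.bit_length()
            trailing = (x & -x).bit_length() - 1  # index of lowest set bit
            out.append((leading, trailing, 64 - leading - trailing))
        prev = cur
    return out
```

Slowly changing series tend to give either identical values or an XOR with only a few meaningful bits, which is where the compression comes from.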

Sprintz looks interesting. I hadn't heard of it, thanks for the link.
How does it compare to DuckDB?
It doesn't really, except I suppose that both are called "databases". DuckDB is intended for OLAP, while Gorilla is specifically designed for time series data. You would never do something like "INSERT INTO users ..." with Gorilla.
DuckDB also has as-of joins

https://duckdb.org/docs/current/guides/sql_features/asof_joi...

Are there workloads time series databases can do where DuckDB would be a bad fit?
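For context on as-of joins: each row on the left is matched with the most recent row on the right whose timestamp is at or before it. A rough pure-Python sketch of that idea follows, assuming both inputs are lists of (timestamp, value) tuples and the right side is sorted by timestamp. The function name is hypothetical and this is not how DuckDB implements it.

```python
from bisect import bisect_right


def asof_join(left, right):
    """Match each left row with the latest right row at or before its timestamp.

    left, right: lists of (timestamp, value); right must be sorted by timestamp.
    Returns (timestamp, left_value, right_value_or_None) tuples.
    """
    right_ts = [ts for ts, _ in right]
    out = []
    for ts, left_value in left:
        # Number of right rows with timestamp <= ts
        i = bisect_right(right_ts, ts)
        right_value = right[i - 1][1] if i else None
        out.append((ts, left_value, right_value))
    return out
```

With trades on the left and quotes on the right, this gives each trade the quote that was in effect when it happened, or None if no quote existed yet.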

To me it sounds like a fancier RRD (as produced and manipulated by RRDtool)
afaiu DuckDB doesn't do delta of delta for timestamps

but it can do delta and bitpacking which is also kinda neat
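To illustrate the difference between plain delta and delta of delta on timestamps, here is a small sketch with made-up function names. It only computes the integer sequences and leaves out the variable-length bit packing that a real encoder would apply afterwards.

```python
def deltas(xs):
    """Differences between consecutive elements."""
    return [b - a for a, b in zip(xs, xs[1:])]


def delta_of_delta(timestamps):
    """Return (first timestamp, first delta, list of delta-of-deltas).

    Assumes at least two timestamps.
    """
    d = deltas(timestamps)
    return timestamps[0], d[0], deltas(d)


def decode(first, first_delta, dods):
    """Rebuild the original timestamps from delta_of_delta() output."""
    out = [first, first + first_delta]
    delta = first_delta
    for dod in dods:
        delta += dod
        out.append(out[-1] + delta)
    return out
```

For timestamps arriving roughly every 60 seconds, e.g. [1000, 1060, 1120, 1181, 1241], plain deltas are [60, 60, 61, 60], while the delta-of-deltas are [0, 1, -1]. The second sequence is mostly zeros and tiny values, so it packs into fewer bits.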