Gorilla: A fast, scalable, in-memory time series database (2016)


	Gorilla: A fast, scalable, in-memory time series database (2016) (blog.acolyer.org)
	25 points by xnorswap 36 days ago

2 comments

x-yl 31 days ago

The simplicity of Gorilla is attractive, but for better compression ratios without much extra compute I'd recommend Sprintz instead: https://github.com/dblalock/sprintz

The downsides are that (a) Sprintz requires the data to be quantised to fixed-point integers, which is usually fine if the data comes from a sensor of some sort, and (b) the Huffman coding step of Sprintz requires dynamic memory allocation, whilst Gorilla is almost trivially implemented without it.

Also see Chimp, which proposes some small tweaks to Gorilla to improve its performance: https://dl.acm.org/doi/abs/10.14778/3551793.3551852
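For anyone who hasn't read the paper, here is a rough sketch of the idea behind Gorilla's value compression: XOR each double with the previous one and note how many leading and trailing zero bits the result has. It deliberately stops short of the actual bit-packing, and the names `float_to_bits` and `xor_stream` are made up for illustration, not taken from any Gorilla or Chimp implementation.

```python
import struct


def float_to_bits(x):
    """Reinterpret a Python float as its 64-bit IEEE 754 bit pattern."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_stream(values):
    """Return (xor, leading_zeros, trailing_zeros) for each value after the first.

    Illustrative only: a real encoder would emit a single control bit for an
    XOR of zero and otherwise pack just the meaningful middle bits.
    """
    out = []
    prev = float_to_bits(values[0])
    for v in values[1:]:
        cur = float_to_bits(v)
        x = prev ^ cur
        if x == 0:
            # Identical to the previous value: nothing but a flag to store.
            out.append((0, 64, 64))
        else:
            leading = 64 - x.bit_length()
            trailing = (x & -x).bit_length() - 1  # index of lowest set bit
            out.append((x, leading, trailing))
        prev = cur
    return out


print(xor_stream([12.0, 12.0, 24.0]))
```

Slowly changing series tend to produce XORs that are zero or have only a short run of meaningful bits, and no heap allocation is needed to compute them.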


jcgrillo 31 days ago

Sprintz looks interesting. I hadn't heard of it, thanks for the link.


mgaunard 31 days ago

How does it compare to DuckDB?


phrotoma 31 days ago

It doesn't really, except I suppose that both are called "databases". DuckDB is intended for OLAP, while Gorilla is specifically designed for time series data. You would never do something like "INSERT INTO users ..." with Gorilla.


tosh 31 days ago

DuckDB also has as-of joins

https://duckdb.org/docs/current/guides/sql_features/asof_joi...

Are there workloads time series databases can do where DuckDB would be a bad fit?
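In case "as-of join" is unfamiliar, here is a minimal sketch of a generic backward as-of join in plain Python: each left row is paired with the most recent right row whose timestamp is not later than its own. This is not DuckDB's implementation or syntax; `asof_join` and the sample data are hypothetical, and both inputs are assumed to be sorted by timestamp.

```python
from bisect import bisect_right


def asof_join(left, right):
    """Pair each (ts, value) in `left` with the latest `right` value at or before ts.

    Both inputs are lists of (timestamp, value) tuples sorted by timestamp.
    Left rows with no earlier right row get None.
    """
    right_ts = [ts for ts, _ in right]
    joined = []
    for ts, left_value in left:
        # Number of right rows with timestamp <= ts.
        i = bisect_right(right_ts, ts)
        right_value = right[i - 1][1] if i > 0 else None
        joined.append((ts, left_value, right_value))
    return joined


trades = [(1, "a"), (5, "b"), (10, "c")]
quotes = [(2, 100), (5, 101), (9, 102)]
print(asof_join(trades, quotes))
# [(1, 'a', None), (5, 'b', 101), (10, 'c', 102)]
```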


HappMacDonald 29 days ago

To me it sounds like a fancier RRD (as produced and manipulated by RRDtool)


tosh 31 days ago

afaiu DuckDB doesn't do delta of delta for timestamps

but it can do delta and bitpacking which is also kinda neat
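To make the difference concrete, here is a small sketch of plain delta encoding next to delta-of-delta encoding for timestamps. It only shows the integer transforms, not the variable-length bit encoding Gorilla applies afterwards or DuckDB's bitpacking; the function names are made up for illustration.

```python
def delta_encode(timestamps):
    """First value, then the difference between each pair of neighbours."""
    if not timestamps:
        return []
    out = [timestamps[0]]
    for prev, cur in zip(timestamps, timestamps[1:]):
        out.append(cur - prev)
    return out


def delta_of_delta_encode(timestamps):
    """First value, first delta, then the change between successive deltas."""
    deltas = delta_encode(timestamps)
    if len(deltas) < 2:
        return deltas
    out = deltas[:2]
    for prev, cur in zip(deltas[1:], deltas[2:]):
        out.append(cur - prev)
    return out


def delta_of_delta_decode(encoded):
    """Invert delta_of_delta_encode."""
    if len(encoded) < 2:
        return list(encoded)
    timestamps = [encoded[0], encoded[0] + encoded[1]]
    delta = encoded[1]
    for dod in encoded[2:]:
        delta += dod
        timestamps.append(timestamps[-1] + delta)
    return timestamps


ts = [1000, 1060, 1120, 1181, 1241]
print(delta_encode(ts))           # [1000, 60, 60, 61, 60]
print(delta_of_delta_encode(ts))  # [1000, 60, 0, 1, -1]
```

With a roughly fixed sampling interval the delta-of-delta values cluster around zero, which is what makes them cheap to store.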
