by chaotic-good 3335 days ago

1. Each leaf node is a fixed-size block that contains compressed values and timestamps. 1000 values is just an example; the number of values in one leaf node is variable.

2. Because there are a lot of data structures. I'm using a tree per series. The database can easily store hundreds of thousands of series. Creating a WAL per series is not feasible.

3. It maintains a list of roots.

4. One I/O operation per node. You will fetch a leaf node for every ~1000 data points and a superblock for every 32 leaf nodes. It's not as bad as it sounds because you will read data for one series only. To span 4 levels, a series has to contain tens of millions of points.

5. Yes. You will need a beefy machine for this with a lot of RAM.

6. Random reads are fast on modern SSDs. It's optimized for SSDs (I simply don't have a computer with an HDD).

7. It stores only composable aggregations - min, max, count, sum, min/max timestamps.

8. All series names are stored in memory. At query time this memory is scanned with a regexp to find the relevant series names and their ids. This is a kind of temporary solution. It works well enough for datasets with small cardinality (around 100K series).
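
To illustrate what "composable" means in point 7, here is a rough Python sketch (not the actual implementation). The `Aggregate` class and the `from_points` and `merge` helpers are hypothetical names, and I'm assuming "min/max timestamps" means the timestamps at which the min and max values were seen. The point is that a parent node's summary can be computed from its children's summaries alone, without touching the raw points.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Aggregate:
    """Hypothetical per-node summary; field names are illustrative only."""
    count: int
    total: float
    min_value: float
    max_value: float
    min_ts: int  # assumed: timestamp at which min_value was observed
    max_ts: int  # assumed: timestamp at which max_value was observed


def from_points(points):
    """Build a leaf-level summary from a non-empty list of (timestamp, value) pairs."""
    lo = min(points, key=lambda p: p[1])
    hi = max(points, key=lambda p: p[1])
    return Aggregate(
        count=len(points),
        total=sum(value for _, value in points),
        min_value=lo[1],
        max_value=hi[1],
        min_ts=lo[0],
        max_ts=hi[0],
    )


def merge(a, b):
    """Combine two summaries into the summary of their union."""
    lo = a if a.min_value <= b.min_value else b
    hi = a if a.max_value >= b.max_value else b
    return Aggregate(
        count=a.count + b.count,
        total=a.total + b.total,
        min_value=lo.min_value,
        max_value=hi.max_value,
        min_ts=lo.min_ts,
        max_ts=hi.max_ts,
    )


points = [(1, 5.0), (2, 1.0), (3, 9.0), (4, 3.0)]
left = from_points(points[:2])
right = from_points(points[2:])
parent = merge(left, right)
```

Here `parent` is the same summary you would get by running `from_points` over all four points. An average can be derived as `total / count`, but something like a median can't be merged this way, which is why it isn't in the list.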