by devdevdev1 3334 days ago
It seems very much like the B+ tree approach is just a mental model put on top of the exact same idea that is being argued against. The initial list of "bad things about LSM approaches" has almost exactly the same items on it as the list of features the B+ approach claims to achieve.

Maybe I'm getting this all wrong, but don't the leaves also represent chunked data, which is compressed?

The Prometheus solution also sequentially places compressed chunks for the same series. The time slicing actually has a lot of benefits and can simply be seen as the first level of the described B+ tree. An index of chunks for a series can then be seen as the second level.
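
To make the "two levels" reading concrete, here is a rough sketch of what I mean. It is not Prometheus code: the slice width, the dict-based layout, and the function names are all made up for illustration. Level 1 picks the time slice, level 2 picks the chunks for one series inside that slice.

```python
# Hypothetical two-level layout, for illustration only:
#   level 1: time slice start -> per-series chunk index
#   level 2: series name      -> list of (timestamp, value) samples
SLICE_WIDTH = 100  # assumed slice width, in arbitrary time units


def slice_start(ts):
    """Return the start of the time slice that contains ts."""
    return ts - ts % SLICE_WIDTH


def append(store, series, ts, value):
    """Add a sample to the chunk list of `series` in the right time slice."""
    chunks = store.setdefault(slice_start(ts), {}).setdefault(series, [])
    chunks.append((ts, value))


def query(store, series, t_from, t_to):
    """Visit only overlapping slices, then only the requested series."""
    out = []
    for start in sorted(store):
        if start + SLICE_WIDTH <= t_from or start > t_to:
            continue  # level 1: skip slices outside the query range
        for ts, value in store[start].get(series, []):  # level 2
            if t_from <= ts <= t_to:
                out.append((ts, value))
    return out


store = {}
for ts in (5, 150, 250):
    append(store, "cpu", ts, ts * 2)
append(store, "mem", 160, 1)
result = query(store, "cpu", 100, 260)
```

Here the query for "cpu" skips the first slice entirely and never touches the "mem" samples.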

The potential read amplification here seems completely equivalent. Just from my high-level view, all properties of the read and write path seem almost identical.

1 comment

>> Maybe I'm getting this all wrong, but don't the leaves also represent chunked data, which is compressed?

A leaf node contains data from a single series, and that data should be read together. An SSTable holding time-series data contains many series, and there is no guarantee that a query will use all of them.
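
A back-of-the-envelope sketch of that difference, with made-up numbers. The series count and sizes are assumptions, and it takes the worst case where the whole unit holding the interleaved series has to be read to get at one of them:

```python
def read_amplification(bytes_read, bytes_needed):
    """Ratio of bytes fetched from storage to bytes the query actually uses."""
    return bytes_read / bytes_needed


# Assumed numbers, for illustration only.
SERIES_PER_SSTABLE = 100  # series interleaved in one SSTable
BYTES_PER_SERIES = 4096   # data per series in the queried range

# Leaf node: holds one series, so everything read is used by the query.
leaf = read_amplification(BYTES_PER_SERIES, BYTES_PER_SERIES)

# SSTable, worst case: all interleaved series are read to serve one of them.
sstable = read_amplification(SERIES_PER_SSTABLE * BYTES_PER_SERIES,
                             BYTES_PER_SERIES)
```

With these numbers the leaf read has a ratio of 1.0 and the SSTable read has a ratio of 100.0. A query that touches more of the series in the table narrows the gap.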

>> The Prometheus solution also sequentially places compressed chunks for the same series.

I'm not really that familiar with Prometheus internals, especially the indexing part. As I understand it, it doesn't align writes, so there is a lot of write amplification at the lower level, which translates to cell degradation and suboptimal performance, but I could be wrong here.