by londons_explore 1494 days ago
Most applications don't require true random access.

Instead, you're typically reading a contiguous range of data, so you only need to decompress the blocks that overlap the range you want to see.
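To make that concrete, here's a rough sketch of block-wise compression with a range read. Everything in it is my own illustration rather than how any particular TSDB lays out data: the `build_blocks` and `read_range` names, the tiny `BLOCK_SIZE`, and the JSON-plus-zlib encoding are all assumptions picked to keep it short.

```python
import bisect
import json
import zlib

BLOCK_SIZE = 4  # points per block; tiny on purpose, real systems use far larger blocks


def build_blocks(points):
    """Compress sorted (timestamp, value) points in fixed-size blocks.

    Each block keeps its first and last timestamp uncompressed so a reader
    can decide whether to open it without decompressing anything.
    """
    blocks = []
    for i in range(0, len(points), BLOCK_SIZE):
        chunk = points[i:i + BLOCK_SIZE]
        blocks.append({
            "start": chunk[0][0],
            "end": chunk[-1][0],
            "data": zlib.compress(json.dumps(chunk).encode()),
        })
    return blocks


def read_range(blocks, t_min, t_max):
    """Return (points in [t_min, t_max], number of blocks decompressed)."""
    ends = [b["end"] for b in blocks]
    first = bisect.bisect_left(ends, t_min)  # first block whose end >= t_min
    result, decompressed = [], 0
    for block in blocks[first:]:
        if block["start"] > t_max:
            break  # blocks are sorted, so nothing later can overlap
        decompressed += 1
        for ts, value in json.loads(zlib.decompress(block["data"])):
            if t_min <= ts <= t_max:
                result.append((ts, value))
    return result, decompressed


points = [(t, t * 10) for t in range(20)]
blocks = build_blocks(points)
selected, touched = read_range(blocks, 5, 9)
print(selected, touched)
```

With 20 points in 5 blocks, the query for timestamps 5 through 9 only opens the two blocks that overlap that range.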

Caching of partial queries can also help substantially. For example, if many queries compute the max() of some per-second data grouped by minute, it is well worth caching those per-minute results rather than reading the source data every time to recalculate the max().
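A minimal sketch of that per-minute max() cache, assuming the raw per-second data for a minute doesn't change once written (so a cached minute never goes stale). `MinuteMaxCache` and the dict standing in for raw storage are hypothetical names for illustration only.

```python
class MinuteMaxCache:
    """Caches max() per minute over per-second data.

    Assumption: `source` maps a second-resolution timestamp to a value and
    is a stand-in for the real (expensive to read) raw storage.
    """

    def __init__(self, source):
        self.source = source
        self.cache = {}        # minute index -> max value in that minute
        self.source_scans = 0  # how many times we had to touch raw data

    def _compute(self, minute):
        self.source_scans += 1
        values = [v for ts, v in self.source.items() if ts // 60 == minute]
        return max(values) if values else None

    def max_by_minute(self, start_minute, end_minute):
        """Return {minute: max} for the inclusive minute range."""
        out = {}
        for minute in range(start_minute, end_minute + 1):
            if minute not in self.cache:
                self.cache[minute] = self._compute(minute)
            out[minute] = self.cache[minute]
        return out


raw = {ts: (ts * 37) % 101 for ts in range(180)}  # three minutes of fake data
cache = MinuteMaxCache(raw)
first = cache.max_by_minute(0, 2)   # computes all three minutes from raw data
second = cache.max_by_minute(1, 2)  # served entirely from the cache
print(first, second, cache.source_scans)
```

The second query overlaps the first, so it never goes back to the raw per-second data.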

Typically, the query engine can track, for every subquery, how frequently it's used and how many data points it involves, and use that to decide how long to cache it for. As far as I'm aware no open-source TSDB does this, despite it being a massive, simple win, especially for alerting systems and dashboards that run very similar queries frequently.
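Roughly what I have in mind, as a sketch only. The scoring formula (use count times log2 of the data point count) is something I made up for illustration, as are the `AdaptiveTtlCache` name and its parameters; a real engine would want to tune this and handle eviction.

```python
import math


class AdaptiveTtlCache:
    """Cache whose TTL grows with how often and how expensive a subquery is."""

    def __init__(self, base_ttl=1.0, max_ttl=300.0):
        self.base_ttl = base_ttl  # seconds of TTL per unit of score
        self.max_ttl = max_ttl    # upper bound so nothing is cached forever
        self.uses = {}            # subquery key -> times requested
        self.entries = {}         # subquery key -> (value, expires_at)

    def ttl_for(self, key, n_points):
        # Hypothetical scoring: frequent, expensive subqueries live longer.
        score = self.uses.get(key, 0) * math.log2(n_points + 1)
        return min(self.max_ttl, self.base_ttl * score)

    def get(self, key, compute, now):
        """Return the cached value, or call compute() -> (value, n_points).

        `now` is passed in explicitly (seconds) to keep the sketch testable.
        """
        self.uses[key] = self.uses.get(key, 0) + 1
        entry = self.entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]
        value, n_points = compute()
        self.entries[key] = (value, now + self.ttl_for(key, n_points))
        return value


calls = []


def expensive_subquery():
    calls.append(1)   # count how often the raw data is actually read
    return 42, 1023   # (result, number of data points scanned)


cache = AdaptiveTtlCache()
cache.get("max(cpu) by minute", expensive_subquery, now=0.0)   # miss
cache.get("max(cpu) by minute", expensive_subquery, now=5.0)   # hit
cache.get("max(cpu) by minute", expensive_subquery, now=11.0)  # expired, recompute
print(len(calls))
```

An alerting rule or dashboard panel that fires the same subquery every few seconds quickly earns a longer TTL, while one-off queries expire almost immediately.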