Hacker News
by axytol 641 days ago
I'm using VictoriaMetrics (VM) to store basic weather data, like temperature and humidity. My initial setup was based on Prometheus; however, it seemed very hard to set a high data retention value. The default was something like 15 days, if I recall correctly.

Since I would actually like to store all recorded values permanently, I could partially achieve this with VM, which let me set a higher threshold, like 100 years. Still not 'forever' as I would have liked, but I guess me and my flimsy weather data setup will have other things than the retention threshold to worry about in 100 years.
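
For a sense of scale, here is a rough back-of-the-envelope sketch of what 100 years of such data might occupy on disk. It uses the usual capacity rule of thumb (retention time × ingested samples per second × bytes per sample). The two series, the 60-second interval, and the 2 bytes per sample are assumptions for illustration, not figures from my setup, and `estimate_disk_bytes` is a hypothetical helper.

```python
# Rough disk estimate for a long-retention TSDB.
# All inputs below are illustrative assumptions.

SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def estimate_disk_bytes(retention_years, series_count, scrape_interval_s,
                        bytes_per_sample=2.0):
    """Return retention_time * samples_per_second * bytes_per_sample."""
    samples_per_second = series_count / scrape_interval_s
    retention_seconds = retention_years * SECONDS_PER_YEAR
    return retention_seconds * samples_per_second * bytes_per_sample


if __name__ == "__main__":
    # Assumed: temperature + humidity (2 series), one sample per minute.
    size = estimate_disk_bytes(retention_years=100, series_count=2,
                               scrape_interval_s=60)
    print(f"approx. {size / 1e6:.0f} MB for 100 years")
```

Under these assumptions the raw volume stays small, which suggests that for a setup like this the limit is more about query behaviour over long ranges than about disk space.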

Would be nice to learn the reason why an infinite threshold is not allowed.

2 comments

Mimir [1] is what we use where I work. We are very happy with it, and we have very long retention. Previously, our Prometheus setup was extremely slow if you went past today, but Mimir partitions the data to make it extremely fast to query even long time periods. We also used Thanos for a while, but Mimir apparently worked better.

[1] https://grafana.com/oss/mimir/

I have done Mimir deployments. I am generally very happy with Mimir. It's very cost efficient. It does require someone to know enough to admin it though.

I didn't pick Thanos because I really like the horizontally scaled blob store architecture the Grafana crew put together.

Yeah. Would be interested to see how VictoriaMetrics compares to Mimir, not just Prometheus.

To be fair, many projects in the Prometheus "long term store" space have come and gone: Thanos, Cortex, M3.

Here you go: https://victoriametrics.com/blog/mimir-benchmark/ It is from Sep 2022; it would be great to get newer results.
Thanos is alive and kicking, and is actually used in some of the biggest infrastructure setups in the world, such as Shopify and Cloudflare.
Hi! None of these are "gone" in any way, and are actually used in various different enterprises at global scale.
Can you elaborate a bit more on the come and gone part?
They're out of tech posters' zeitgeist but AFAIK they are each still maintained and fulfilling people's needs. Just not as much commentary or front-of-mind-share.
Usually performance and storage concerns. You can set effectively infinite retention on Prometheus, but after a long enough period you're going to like querying it even less.

Most TSDBs aren't built for use cases like "query hourly/daily data over many years". Many use cases aren't looking further than 30 days because they're focused on things like app or device uptime and performance metrics, and if they are running on longer time frames they're recording (or keeping, or consolidating data to) far fewer data points to keep performance usable.
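
As a minimal sketch of what "consolidating data to far fewer data points" can look like, the snippet below averages raw samples into fixed-size time buckets (for example, hourly). The function name `downsample_mean` and the `(unix_timestamp, value)` sample layout are assumptions for illustration; this is not how any particular TSDB implements downsampling.

```python
from collections import defaultdict


def downsample_mean(samples, bucket_seconds):
    """Average (unix_timestamp, value) samples into fixed-width buckets.

    Returns a list of (bucket_start, mean_value), sorted by bucket_start.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Align each timestamp to the start of its bucket.
        bucket_start = ts - (ts % bucket_seconds)
        buckets[bucket_start].append(value)
    return [(start, sum(vals) / len(vals))
            for start, vals in sorted(buckets.items())]


if __name__ == "__main__":
    # Hypothetical temperature readings taken every 30 minutes.
    raw = [(0, 10.0), (1800, 12.0), (3600, 20.0), (5400, 22.0)]
    # Consolidate to one point per hour.
    print(downsample_mean(raw, 3600))
```

Querying many years of hourly or daily averages then touches far fewer points than querying the raw samples, at the cost of losing the original resolution for older data.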