by aprdm 2151 days ago
We don't care that much about resiliency, but we do back up the Prometheus data folder using the snapshot API.
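
For anyone who hasn't used it, here is a rough sketch of what triggering a snapshot can look like. It assumes Prometheus was started with --web.enable-admin-api and is reachable at the base URL you pass in; the helper names (snapshot_url, parse_snapshot_name, take_snapshot) are made up for illustration and are not our actual tooling.

```python
import json
import urllib.request


def snapshot_url(base_url: str, skip_head: bool = False) -> str:
    """Build the TSDB snapshot endpoint URL for a Prometheus server."""
    base = base_url.rstrip("/")
    flag = "true" if skip_head else "false"
    return f"{base}/api/v1/admin/tsdb/snapshot?skip_head={flag}"


def parse_snapshot_name(body: bytes) -> str:
    """Extract the snapshot directory name from the API's JSON response."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise RuntimeError(f"snapshot failed: {payload}")
    return payload["data"]["name"]


def take_snapshot(base_url: str, timeout: float = 60.0) -> str:
    """POST to the admin API and return the name of the new snapshot.

    Assumes the admin API is enabled; otherwise the request fails.
    """
    req = urllib.request.Request(snapshot_url(base_url), method="POST")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return parse_snapshot_name(resp.read())
```

The snapshot ends up under snapshots/<name> inside the Prometheus data directory, so a backup job can call take_snapshot("http://localhost:9090") and then copy that directory somewhere safe.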

There are a couple of articles about sharding and federation with Prometheus; I don't know if they existed when you tried it.

For us, problems are usually local to a datacenter. Having a dropdown where you can pick the datacenter has proven good enough. It is unlikely that we would have a global issue in a service.

Sorry if that was unclear, but we have our own datacenters. Our Prometheus VMs are essentially free in the grand scheme of things, considering the amount of compute we have.