by sz4kerto 3146 days ago
No good migration story for existing data, that'll hurt us quite a lot. :(

Yeah, as Prometheus's local storage is meant more as a transient / non-durable metrics store, the only current way to migrate while simultaneously accessing old and new data is to run both the old and new servers and have the new one read old data from the other one via the remote-read integration.
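For reference, a minimal sketch of what that looks like in the new server's config. It assumes the old 1.x server (1.8 or later, which exposes a remote-read endpoint) is kept running on the same host with its listen address moved to port 9094; the host and port are placeholders, not anything prescribed here.

```yaml
# prometheus.yml for the new (2.0) server.
# Assumption: the old 1.x server still runs on localhost:9094
# with its original storage directory and no scrape configs.
remote_read:
  - url: "http://localhost:9094/api/v1/read"
```

Queries against the new server then also pull older samples from the old one, and the old server can be shut down once its data has aged out of whatever retention you care about.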

Someone could write a tool to do a full migration of the old storage format to the new one, but the formats are completely different, and at least in a naive version of such tooling, the migration would have to happen offline and would take a very long time to run for large storages.

EDIT: If you would like to fund development of such a tool, let us know :)

> EDIT: If you would like to fund development of such a tool, let us know :)

We're too small for that, unfortunately. Someone said at DockerCon that migrating large (multi-TB) stores would take a long time; this doesn't apply to us, we have only ~0.1 TB perf data as of now.

Out of curiosity, do you care about migrating the data online, or would a brief Prometheus downtime (and thus gap in data) be ok?
Downtime would be acceptable I think, especially because we could just launch a separate instance of Prom while the main one is being migrated.
There is a transition feature: https://www.robustperception.io/accessing-data-from-promethe...

The problem with data migration is that the two versions of the system lay out the data quite differently, so converting from one to the other would take a lot of disk seeks. In the worst case you could be looking at days to convert the data, which isn't really an option for most systems that care about older data.

Seek time is not relevant; our stuff is on SSDs. Thanks for the link, though I already knew about the transition feature.
The first attempts at this are happening now:

https://groups.google.com/forum/#!topic/prometheus-users/wO5...

- https://github.com/Percona-Lab/prom-migrate (requires old Prom server to run for reading out data)

- https://github.com/juliusv/prom-data-migrator (operates offline on old and new storage dirs directly)

Percona just released a migration tool.

https://github.com/Percona-Lab/prom-migrate

If you need your Prometheus data to survive, you're doing it wrong.
What are we doing wrong? We use Prometheus for many things, for example I'd like to know how specific latencies have changed over time. Why should I store these numbers somewhere else?
Prometheus is not intended as durable long-term storage; it's fundamentally limited to the size of a machine. You should also design your monitoring to be able to tolerate completely losing the data of a Prometheus.

We recommend using another system for long term data, see https://prometheus.io/docs/operating/integrations/#remote-en... for some examples.

The problem is (as you know) that single machines are in practice still reliable enough, and Prometheus is still good enough at storing data for long periods, that many people have come to rely on it despite warnings :)