| Both VictoriaMetrics and Grafana Mimir are a good fit for long-term storage of Prometheus data. The difference is in the type of storage they use: VictoriaMetrics stores data on persistent disks (aka block storage), while Grafana Mimir stores data in S3-like object storage. Both storage types, block storage and object storage, can be used for long-term storage. They differ in the following ways in the context of major cloud providers (AWS, GCP, Azure): - Object storage space usually costs 2x-8x less than block storage space. - Object storage has up to 100x higher data access latency than block storage (hundreds of milliseconds for object storage vs milliseconds for block storage). - Block storage usually has a much lower network-related error rate compared to object storage. For example, it is quite common practice to retry reading data from object storage on network errors, while block storage-based filesystems are much more reliable in this respect in major cloud providers. - Cloud providers tend to charge for every read operation on object storage, while reading from block storage is free. This point is usually overlooked when estimating costs for block storage vs object storage. Given these differences, block storage usually provides better performance than object storage. Block storage can also cost less than object storage when the stored data is read frequently. VictoriaMetrics is optimized for HDD-based block storage, so there is no need to use more expensive SSD-based block storage in most cases. Additionally, VictoriaMetrics compresses production metrics 2x-10x better than Prometheus-like solutions that store data in object storage (Thanos, Cortex, Grafana Mimir). This also reduces long-term storage costs. On top of this, the enterprise version of VictoriaMetrics can be configured to downsample historical data, so it takes less disk space [1]. [1] https://docs.victoriametrics.com/#downsampling |
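The claim that block storage can end up cheaper when data is read frequently can be checked with a rough back-of-the-envelope model. The sketch below is only an illustration: the function names are made up for this example, and the prices are placeholder values chosen to match the "object storage is a few times cheaper per GB" ratio from the comment, not quotes from any cloud provider. It also ignores compression differences, network egress and write requests.

```python
# Rough monthly cost model: object storage (per-GB + per-read charges)
# versus block storage (per-GB only). All prices are hypothetical placeholders.

OBJECT_PRICE_PER_GB = 0.02      # assumed, USD per GB-month
BLOCK_PRICE_PER_GB = 0.08       # assumed, 4x the object price (within the 2x-8x range)
PRICE_PER_1K_READS = 0.0004     # assumed, USD per 1,000 read requests


def monthly_cost_object(stored_gb, reads_per_month,
                        price_per_gb=OBJECT_PRICE_PER_GB,
                        price_per_1k_reads=PRICE_PER_1K_READS):
    """Storage charge plus a charge for every read request."""
    return stored_gb * price_per_gb + reads_per_month / 1000 * price_per_1k_reads


def monthly_cost_block(stored_gb, price_per_gb=BLOCK_PRICE_PER_GB):
    """Storage charge only; reads are assumed to be free."""
    return stored_gb * price_per_gb


def break_even_reads(stored_gb,
                     block_price_per_gb=BLOCK_PRICE_PER_GB,
                     object_price_per_gb=OBJECT_PRICE_PER_GB,
                     price_per_1k_reads=PRICE_PER_1K_READS):
    """Reads per month above which block storage becomes the cheaper option."""
    storage_gap = stored_gb * (block_price_per_gb - object_price_per_gb)
    return max(storage_gap, 0.0) / price_per_1k_reads * 1000


if __name__ == "__main__":
    gb = 1000
    reads = break_even_reads(gb)
    print(f"break-even at about {reads:,.0f} reads/month for {gb} GB")
    print(f"object: ${monthly_cost_object(gb, reads):.2f}, block: ${monthly_cost_block(gb):.2f}")
```

Plugging in real prices and a measured read rate for your own workload is the only way to tell which side of the break-even point you are on.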
I think keeping "long-term storage" in an S3-compatible location is the way to go, but you need the ability to use local storage as a cache so that queries on recent data, or just on the date range you're working with, can be fast.
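To make the "local storage as cache" idea concrete, here is a minimal read-through cache sketch. Everything in it is an assumption for illustration: `ReadThroughCache` and the `fetch` callable are hypothetical names, `fetch` stands in for a slow object storage GET, and an in-memory dict stands in for the local disk. It is not how Mimir, Thanos or any other system actually implements caching.

```python
from collections import OrderedDict
from typing import Callable


class ReadThroughCache:
    """LRU cache in front of a slow backend, bounded by total cached bytes."""

    def __init__(self, fetch: Callable[[str], bytes], max_bytes: int):
        self._fetch = fetch          # hypothetical: reads one block from object storage
        self._max_bytes = max_bytes
        self._used = 0
        self._items: "OrderedDict[str, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        if key in self._items:
            # Cache hit: mark as most recently used and serve locally.
            self._items.move_to_end(key)
            self.hits += 1
            return self._items[key]

        # Cache miss: pay the object storage latency (and request cost) once.
        self.misses += 1
        data = self._fetch(key)
        if len(data) <= self._max_bytes:
            self._items[key] = data
            self._used += len(data)
            # Evict least recently used blocks until we fit the budget again.
            while self._used > self._max_bytes:
                _, evicted = self._items.popitem(last=False)
                self._used -= len(evicted)
        return data
```

With a cache like this, repeated queries over the same recent time range hit local storage, and only the first read of each block goes to object storage.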