by annanay 1028 days ago
Grafana Tempo also switched from a Protobuf storage format to Apache Parquet last year. It's fully open source, and the proposal (from April 2022) is here: https://github.com/grafana/tempo/blob/main/docs/design-propo...

The relevant code for the Parquet storage backend can be found here: https://github.com/grafana/tempo/tree/main/tempodb/encoding

disclosure: I work for Grafana!


Cool, thanks for sharing. Can you say something about how it's worked out? Has it reduced bandwidth or CPU usage?

The Parquet backend helped unlock trace search for large clusters (>400MB/s data ingestion) and over longer periods of time (>24h). It also helped unlock TraceQL (a query language for traces, similar to PromQL/LogQL). There are more details in this blog post: https://grafana.com/blog/2023/02/01/new-in-grafana-tempo-2.0...

I don't have the exact CPU/bandwidth numbers on me right now, but CPU usage has gone up by about 50% on our "Ingester" and "Compactor" components (you can read up on the architecture here: https://grafana.com/docs/tempo/latest/operations/architectur...). That cost is the trade-off for optimising read performance, which improved significantly.
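
For anyone unfamiliar with the trade-off, here is a toy sketch of why a columnar layout can cost more at write time and less at read time. It is only an illustration under made-up assumptions: the field names (trace_id, service, duration_ms) and the helper functions are hypothetical, and none of this is Tempo's schema or code.

```python
# Toy illustration only: hypothetical span fields, not Tempo's schema or code.
rows = [
    {"trace_id": "a1", "service": "checkout", "duration_ms": 120},
    {"trace_id": "b2", "service": "cart", "duration_ms": 45},
    {"trace_id": "c3", "service": "checkout", "duration_ms": 310},
]


def to_columns(rows):
    """Pivot row records into one list per field (extra work on the write path)."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


def search_rows(rows, service):
    """Row layout: visit every whole record just to test one field."""
    return [r["trace_id"] for r in rows if r["service"] == service]


def search_columns(columns, service):
    """Column layout: scan only the 'service' column, then look up matching IDs."""
    hits = [i for i, s in enumerate(columns["service"]) if s == service]
    return [columns["trace_id"][i] for i in hits]


columns = to_columns(rows)
print(search_rows(rows, "checkout"))
print(search_columns(columns, "checkout"))
```

Both searches return the same trace IDs. The difference is that the columnar search never touches duration_ms, which is roughly the effect a real columnar format aims for on much larger data, at the price of the pivot step when writing.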