by jwineinger 1079 days ago
I see that this uses a Redis backend. Does that mean every metric operation goes over the network? Are those synchronous/blocking calls, and are they pipelined with Redis? If not, it seems like this could be quite detrimental to app response times if it does many metric ops when handling a single request.

The other thing I noticed is that the Redis backend has a 1 hour TTL. Does that exist to clean up once a metric is no longer getting touched (presumably because a worker/pod is gone)?

1 comments

Correct, it uses Redis for multiprocessing support. The idea is that you have a sidecar with your service that will sync metrics between all of the processes for scrapes.

By default the operations are blocking, but they are fast. They are pipelined for scrapes (retrieving all the metric values is the slowest part, and pipelining speeds it up).
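To illustrate why pipelining helps on scrapes, here is a minimal sketch. It is not the library's code: `FakeRedis` and `FakePipeline` are hypothetical in-memory stand-ins that only count round trips, shaped like the `pipeline()` / `execute()` pattern in redis-py so it runs without a server.

```python
class FakePipeline:
    """Hypothetical stand-in: buffers GET commands and sends them in one go."""

    def __init__(self, client):
        self.client = client
        self.keys = []

    def get(self, key):
        # Nothing is sent yet, the command is only queued locally.
        self.keys.append(key)
        return self

    def execute(self):
        # All queued commands travel together: one round trip in total.
        self.client.round_trips += 1
        return [self.client.data.get(key) for key in self.keys]


class FakeRedis:
    """Hypothetical stand-in for a Redis client that counts round trips."""

    def __init__(self, data):
        self.data = dict(data)
        self.round_trips = 0

    def get(self, key):
        # A plain GET costs one round trip per key.
        self.round_trips += 1
        return self.data.get(key)

    def pipeline(self):
        return FakePipeline(self)


def scrape_one_by_one(client, keys):
    """Fetch every metric value with its own blocking call."""
    return [client.get(key) for key in keys]


def scrape_pipelined(client, keys):
    """Queue every GET, then fetch all values in a single round trip."""
    pipe = client.pipeline()
    for key in keys:
        pipe.get(key)
    return pipe.execute()
```

With a real client the saving is network latency: N metrics cost N round trips one by one, but a single round trip when pipelined.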

If you are curious, there are some benchmark tests that I used to check that the library is correct by comparing it with the official one. In my tests this approach is faster than the mmap files one, but they cover a really limited scenario, so take it with a grain of salt: https://github.com/Llandy3d/pytheus-bench#the-results-

Correct for the TTL: if a metric doesn't get scraped or incremented for longer than the TTL, it will be cleaned up. 1 hour is the current default value.
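The TTL behaviour can be sketched like this. It is an in-memory illustration under my own assumptions, not the actual backend: the `TTLStore` class and its method names are hypothetical, and in Redis itself the expiry would be handled by key expiration rather than a manual purge.

```python
import time


class TTLStore:
    """Hypothetical sketch: metrics expire unless incremented or scraped."""

    def __init__(self, ttl_seconds=3600.0, clock=time.monotonic):
        self.ttl = ttl_seconds      # 1 hour, matching the default mentioned above
        self.clock = clock          # injectable so the behaviour can be tested
        self._values = {}
        self._expires_at = {}

    def _touch(self, name):
        # Any touch pushes the expiry a full TTL into the future.
        self._expires_at[name] = self.clock() + self.ttl

    def purge(self):
        # Drop every metric whose expiry time has passed.
        now = self.clock()
        expired = [n for n, t in self._expires_at.items() if t <= now]
        for name in expired:
            del self._values[name]
            del self._expires_at[name]

    def inc(self, name, amount=1.0):
        self.purge()
        self._values[name] = self._values.get(name, 0.0) + amount
        self._touch(name)

    def scrape(self):
        # A scrape reads all live metrics and refreshes their expiry.
        self.purge()
        for name in self._values:
            self._touch(name)
        return dict(self._values)
```

A metric from a worker that has gone away stops being incremented, and once scrapes stop touching it too, it disappears after the TTL.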

To go a little further into detail: the Rust-based backend has a separate thread for changing metric values, so in your Python code, whether sync or async, the operation is extremely fast. On the Rust side, the operations are collected and pipelined together asynchronously. From the tests I've made this is enough, but if someone has an insane amount of metrics to modify, it is possible to add support for multiple "writer" threads. The important bit is that operations for a single metric are done in order, and this can be easily achieved by hashing the metric modulo the number of threads. I hope this answers your questions!
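As a rough sketch of the multiple-writer idea (in Python rather than Rust, and not the actual backend code): every operation for a given metric is routed to the same writer by hashing the metric name modulo the number of writers, so per-metric order is kept. The names `writer_index` and `run_writers` and the choice of CRC32 are my own assumptions for illustration.

```python
import queue
import threading
import zlib


def writer_index(metric_name, num_writers):
    """Map a metric name to a fixed writer using a stable hash."""
    return zlib.crc32(metric_name.encode("utf-8")) % num_writers


def run_writers(operations, num_writers=4):
    """Dispatch (metric_name, value) operations to per-writer FIFO queues.

    Returns, for each writer, the operations in the order it applied them.
    """
    queues = [queue.Queue() for _ in range(num_writers)]
    applied = [[] for _ in range(num_writers)]

    def worker(index):
        while True:
            op = queues[index].get()
            if op is None:          # sentinel: no more work for this writer
                break
            # A real backend would send the operation to Redis here.
            applied[index].append(op)

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(num_writers)
    ]
    for thread in threads:
        thread.start()

    # Same metric -> same queue, and a queue is FIFO, so order is preserved.
    for op in operations:
        queues[writer_index(op[0], num_writers)].put(op)

    for q in queues:
        q.put(None)
    for thread in threads:
        thread.join()
    return applied
```

Operations on different metrics may be applied in any relative order across writers, which is fine as long as each individual metric sees its own updates in sequence.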