by aseipp 1106 days ago
Continuous performance monitoring of a service, from its inception. I'm building a storage service using SeaweedFS and also a web UI for another project. One thing I'm looking at doing is using k6[1] to stress test API endpoints and web frontends on a continuous basis under various conditions.[2] For example, I'm trying to lean hard into using R2/S3 for storage offload, so my question is: "What does it look like when Seaweed offloads a local volume chunk to S3 aggressively, and what is the impact of that in a 90/10 hot/cold split on objects?" Maybe a 90/10 storage split is too aggressive or optimistic to hit a specific performance number. Every so often -- maybe every day at certain points, or a bigger global test once a week -- you run k6 against all these endpoints, record the results, and shuffle them into Prometheus so you can see if things get noticeably worse for the user. Test login flows under bad conditions: when the objects users request are really cold, when large paginations occur, etc.

You can run numbers manually, but I think designing for it up front is really important to keep performance targets on lock. That's where Prometheus and Grafana come in. And I think looking at performance numbers is a really good way to understand system dynamics, and it helps you ask why something is hitting some threshold. On the other hand, there are so many tools, and they're often so fun to play with, that it's easy to get carried away. There's also a pretty reasonable amount of complexity involved in setting it up, so it's also easy to just say fuck it a lot of times and respond to issues on demand instead.

[1] http://k6.io/, it's also a Grafana project.

[2] It can test not only normal REST endpoints but also browsers, thanks to the use of headless Chrome/Chromium! So you can actually look at first paint latency and things like that too.