by Millenis
3705 days ago
Similar experience here. We put a lot of time and effort into making our monitoring system more highly available than the thing it is monitoring. On top of that, scaling only vertically on a single node doesn't seem like a good design. When you're spread across many cloud providers, some on-premises gear and a bunch of legacy stuff from acquisitions, the polling and firewall opening becomes a blocker. There are ways to poll things and push metrics without opening millions of firewall ports to every security group. Sensu does that quite well, and it scales. I'm not pitching it as the saviour either, as it has other trade-offs. However, Prometheus, as far as I can see, will suffer from the 2 points you make, and all of the workarounds (running duplicate identical nodes and federating them) also come with various drawbacks. There are quite a few cloud providers who do an awesome job at meeting those requirements. I think the only blocker on those is the current pricing model. As soon as somebody does something disruptive in this area I can see a great deal of convergence to it.
For Prometheus at least, we're so efficient that it actually works out okay for the vast majority of users. You'd typically need thousands of instances doing the same thing inside a single datacenter before you get into our (admittedly more involved) horizontal sharding approach.
http://www.robustperception.io/scaling-and-federating-promet... has more information.
> There are ways to poll things and push metrics without opening millions of firewall ports to every security group. Sensu does that quite well, and it scales.
I don't think that's quite a fair comparison. Sensu Just Works when there's no outbound firewall; Prometheus Just Works when there's no inbound firewall. Add a firewall in that direction for either one and things break down.
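To make the symmetry concrete, here's a rough sketch in Python. It's a toy model, not anything from Sensu or Prometheus: the `Firewall` class and `metrics_flow` function are made up for illustration, and it assumes a "pull" system has the server connect in to the monitored host while a "push" system has the host connect out to the server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Firewall:
    """Hypothetical firewall around a monitored host."""
    blocks_inbound: bool = False   # connections opened from outside toward the host
    blocks_outbound: bool = False  # connections opened by the host toward the outside


def metrics_flow(model: str, fw: Firewall) -> bool:
    """Return True if metrics can reach the monitoring server.

    "pull": the server opens a connection to the host (needs inbound allowed).
    "push": the host opens a connection to the server (needs outbound allowed).
    """
    if model == "pull":
        return not fw.blocks_inbound
    if model == "push":
        return not fw.blocks_outbound
    raise ValueError(f"unknown model: {model!r}")


if __name__ == "__main__":
    for fw in (Firewall(), Firewall(blocks_inbound=True), Firewall(blocks_outbound=True)):
        print(fw, "pull:", metrics_flow("pull", fw), "push:", metrics_flow("push", fw))
```

Each model breaks on exactly one direction of firewall and is indifferent to the other, which is why neither is the general answer to the firewall problem.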