> 1. The common failure of docs to explain to users why they might choose one thing or another. "If you want to do x... If you want to do y..." What if I don't know? Observability docs in general struggle with this. So many data sources can emit so many types of metrics in so many formats, and every tool makes this impossible promise of consolidating it all into one space seamlessly. But tools like Grafana pride themselves so much on visualizing _anything_ that they paint themselves into a corner where they can't be prescriptive about common uses or methods without excluding or confusing others. So a lot of the prescriptive answers to "what if I don't know?" get chucked onto the account and support teams of commercial vendors, because the docs can't anticipate every possible context in which an observability tool will get deployed. Each solution ends up being custom tailored and poorly portable to anyone else's, often not even to other customers with the same data sources and goals at the same scale, due to wacky labelling differences or legacy requirements or some internal stakeholder demand. More narrowly focused tools don't have as many of these problems, but not many organizations want narrowly focused observability tools. (Lots of _people_ do, but orgs don't want to pay out deals to multiple vendors for what looks like different flavors of the same result. And hey look, it's Grafana Cloud or Datadog or whatever, it can do _anything_, so you devs and also bizops and SRE and IT and hey, sales wants a dashboard too and so does the company cafeteria, why not, you all can just use this one tool and we just deal with one bill with a volume discount, right? Right??) Smarter tools don't have as many of these problems: they paper over the docs' limitations by being better able to anticipate or surface connections between data sources, metrics, logs, traces, events, etc., and they do so with better interfaces.
But especially for high-cardinality data, the usability of those tools either seems to fall apart or their companies charge Datadog-sized invoices.
I was shopping for one after being outside of this field for a while, and they all do the 101 features and the kitchen sink model, which adds to the complexity. That goes for Datadog and Grafana, but also for the open source ones like SigNoz itself.
Ages ago it was all about metrics; today it's metrics, traces, logs, APM, alerting, exceptions, and a dozen other acronyms, on top of the protocols (statsd, Prometheus, OpenTelemetry), paired with crazy complicated and unwieldy graph building UIs. Let's not even talk about pricing models. The entire business model is based around having one more checkmark in the feature list than the competition. The wire format (OpenTelemetry) has never been the pain point in this space.
For a moment, I seriously considered just going back to the 2000s and using RRDtool.