|
> Processing/streaming logs to get metrics is a terrible waste of time, energy and money. Spend that producing high quality metrics directly from the apps you are looking after/writing/decomming Yeah nah, but, okay, nah yeah. Generating metrics in the app is much more intrusive, and requires that you figure out the metrics you need ahead of time. It adds dependencies, sockets, and threads to your app. Unless you're very careful, it's also easy to end up double-aggregating, computing medians of medians and other meaningless pseudo-statistics - if you're using the Dropwizard Metrics library, for example, you've already lost. If you output structured log events, where everything is JSON or whatever and there are common schema elements, you can easily pull out the metrics you need, configure new ones on the fly, and retrospectively calculate them if you keep a window of log history. When i've worked on systems with both pre- and post-calculated metrics, the post-calculated metrics were vastly more useful. The huge, virtually showstopping, caveat here is that there is lots of decent, easy-to-use tooling for pre-calculated metrics, and next to nothing for post-calculated metrics. You can drop in some libraries and stand up a couple of servers and have traditional metrics going in a day, with time for a few games of table tennis. You need to build and bodge a terrifying pile of stuff to get post-calculated metrics going. Anyway if there's a VC reading this with twenty million quid burning a hole in their pocket who isn't fussy about investing in companies with absolutely no path to profitability, let me know, and i'll do a startup to fix all this. I'll even put the metrics on the blockchain for you, guaranteed street cred. |
Oh no, never do anything fancy on the client end. yeah thats total trash. Any client that does any kind of aggregating is a massive pain in the arse.
Counters are good enough for 90% of everything you want. You can turn counters into hits per second easily. Plus they are more resistant to time based averaging. If you do your stats correctly, you can even has resetting counters create nice smooth graphs (non negative derivatives are a god send)
> Dropwizard
Yes, this is a library that argues strongly against the use of metrics. From what I recall 1 node of casasndra will output close to 50,000 metrics by default. That is too much.
When a team I worked with were migrating away from splunk to graphite/grafana, they shat out something close to a million metrics. 99.8% were totally useless.
> You need to build and bodge a terrifying pile of stuff to get post-calculated metrics going.
Yes! I think thats my main objection. Its so bloody expensive to do post-hock metrics. you can buy in splunk, but thats horrifically expensive. Or you can use an open source version and loose 4 person years before you even get a graph.