
by dzimine 2536 days ago

The dichotomy is real, and it reflects the Dev vs Ops dichotomy. DevOps made Dev and Ops collaborate but didn't blend the roles and skills. Ops appreciate logs but need consistent metrics to identify and root-cause a problem. Devs appreciate metrics but need logs to debug and fix it. Opinions on which matters more are shaped by role and experience; the author makes it clear that, as a team, we need both.

> For many of the services I run, I haven't looked at logs in months, because the metrics tell the story. If service is degraded, I can usually correlate it to another downed service, a network failure, or a recent change. No logs needed.

Good point, and it echoes Brendan Gregg, the author of the USE method:

> "The USE Method is based on three metric types and a strategy for approaching a complex system. I find it solves about 80% of server issues with 5% of the effort, ..." (http://www.brendangregg.com/usemethod.html)

Solving 80% of issues with 5% of the effort is commendable; the remaining 20% goes to developers, where the other 95% of the effort is spent debugging and fixing the problem, primarily by reasoning about the logs.

So:

- "Which of metrics or logs is more important?" is relative and moot.

- "Can metrics be extracted from logs?" Yes. "Is it practical?" It depends: likely NO for DIY. The fact that ELK doesn't do it particularly well doesn't mean that other products can't or don't.
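To make the "metrics from logs" point concrete, here is a minimal Python sketch of the DIY approach. The log format, the field names (status, latency_ms), and the extract_metrics helper are all assumptions made up for illustration; they are not taken from ELK or any other product.

```python
import re
from collections import defaultdict

# Hypothetical log format (an assumption for this sketch):
#   "<ISO timestamp> <LEVEL> <service> status=<code> latency_ms=<n>"
LINE_RE = re.compile(
    r"^(?P<minute>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}):\d{2}\S*\s+"
    r"(?P<level>[A-Z]+)\s+(?P<service>\S+)\s+"
    r"status=(?P<status>\d{3})\s+latency_ms=(?P<latency>\d+)"
)


def extract_metrics(lines):
    """Aggregate request count, 5xx count, and max latency per (minute, service).

    Returns (metrics, skipped), where skipped counts lines that did not
    match the expected format.
    """
    buckets = defaultdict(
        lambda: {"requests": 0, "errors": 0, "max_latency_ms": 0}
    )
    skipped = 0
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            skipped += 1  # stack traces, free-form messages, format drift
            continue
        bucket = buckets[(m["minute"], m["service"])]
        bucket["requests"] += 1
        if int(m["status"]) >= 500:
            bucket["errors"] += 1
        bucket["max_latency_ms"] = max(
            bucket["max_latency_ms"], int(m["latency"])
        )
    return dict(buckets), skipped


sample = [
    "2019-03-01T12:00:05Z INFO checkout status=200 latency_ms=42",
    "2019-03-01T12:00:17Z ERROR checkout status=503 latency_ms=910",
    "2019-03-01T12:01:02Z INFO checkout status=200 latency_ms=38",
    "stack trace line that does not match the format",
]
metrics, skipped = extract_metrics(sample)
```

The skipped counter hints at why this is likely impractical as DIY: the moment a developer changes a log message, matching lines quietly drop out of the metric, so the numbers are only as consistent as the log format is.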