by tkone 798 days ago
If you’re debugging something simple or non-distributed, this product isn’t for you.

If you’re working on anything distributed, log aggregation becomes a must. But, also, if you’re working on anything distributed and you’re looking at logs, you’re desperate. Distributed traces are so much higher quality.

3 comments

When I formed these opinions I was working on Materialize, which is basically the polar opposite of "simple and non-distributed". However, it was still quite common that I knew exactly which process was doing something weird and unexpected.
Maybe it’s the difference between tracking a bug (abnormal operation) vs understanding behavior of a complex system (normal operation)?
Yup, and the reason no one markets something like "tail the logs for server X" is that, if you're talking in the context of an individual server, you're too small for anyone to care about.
I've got logs from hundreds of servers that I use standard tools to look at, and that's a small system. Centralising logs has been a thing for decades.
Which is fine, I'm just saying you're not the target market for the big observability vendors.

The current generation of observability tools is built for distributed systems that are basically too complex to reason about, and so you have other ways of monitoring and debugging them. When you have tens of thousands of ephemeral containers running hundreds of services, you can't just look at some logs for a server to understand what's going on (ignoring the fact that servers aren't even a primitive in this system).

Tens of GBs of logs a day just doesn't move the needle on pricing. They want the customers that are going to generate 7 figures in revenue, and those customers aren't talking about aggregating logs from a few hundred servers.

Sorry, did plenty of "distributed" tracing back in the day and this is just not the case. I can't help but feel like you're after-the-fact rationalizing as if you need this for diagnosing anything "distributed" or "complicated".

Distributed anything is actually easier in most cases because you will always have input and output. Sure, if you're debugging a complicated and coordinated "dance" between two concurrent threads/processes then yeah, fully agreed, but then you're deep in uncharted territory and you need all the help you can get.