
by jcgrillo 795 days ago

Nice talk. The first (and best!) log search solution I experienced in my career was simply a gigantic tree of compressed logs on a Hadoop cluster. As someone who spent a bunch of time analyzing logs, the "query interface" being "anything you can sling at the Hadoop cluster" was phenomenally awesome. The basic computering tools are programming languages, and eventually you encounter problems where you need a real (Turing-complete) one.
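
To make the "programming language as query interface" idea concrete, here is a minimal single-machine sketch in Python. It is not the original setup: the gzip file layout, the `ERROR <service>` line format, and the names `iter_log_lines`, `count_by`, and `error_service` are all assumptions made up for illustration. On a cluster, the same kind of per-line function would be distributed across nodes rather than run in one loop.

```python
import gzip
import os
import re
from collections import Counter
from typing import Callable, Iterator, Optional


def iter_log_lines(root: str) -> Iterator[str]:
    """Yield every line of every .gz file under root, decompressing on the fly."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".gz"):
                continue
            path = os.path.join(dirpath, name)
            with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
                for line in fh:
                    yield line.rstrip("\n")


def count_by(root: str, key: Callable[[str], Optional[str]]) -> Counter:
    """Run an arbitrary Python function over each line and tally its non-None results."""
    counts: Counter = Counter()
    for line in iter_log_lines(root):
        k = key(line)
        if k is not None:
            counts[k] += 1
    return counts


# Hypothetical line format: "ERROR <service> <message>".
ERROR_RE = re.compile(r"\bERROR\b\s+(\S+)")


def error_service(line: str) -> Optional[str]:
    """Return the service name on ERROR lines, or None for everything else."""
    m = ERROR_RE.search(line)
    return m.group(1) if m else None
```

Calling `count_by("/path/to/logs", error_service)` would tally errors per service with no index involved. The "query" is just a function, so swapping in any other logic means writing more Python rather than fighting a DSL.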

One great side effect of this was that service developers weren't afraid to write logs. We logged excessively, and it didn't cost too much. If we'd been indexing everything in ES (Elasticsearch), it would have bankrupted us.

These days with S3 and the cloud, Hadoop (or the EMR suite) per se probably isn't the way to go. But I'd sure like to see observability solutions give me a first-class programming model that I as a user can interact with--not some bespoke "query DSL"--and accept that instantaneous indexed retrieval isn't important.

This paper is really interesting: https://www.usenix.org/system/files/osdi21-rodrigues.pdf

Stuff like this gives me hope we can have it both ways. With highly tuned compression and programmatic access the user is empowered and the cost is minimized.