by fells 703 days ago
It's always struck me that these are two wildly different concerns though.

Use metrics & SLOs to help diagnose the health of your systems. Derive those directly from logs/traces, keep a sample of the raw data, and now you can point any alert at the sampled data to help understand a client-facing issue.
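To make that concrete, here's a rough sketch of rolling metrics up from a log stream while holding on to a sample of the raw events. Everything in it is made up for illustration: the `Event` fields, the `keep_every` knob, and the choice to always keep 5xx events are assumptions, not how any particular observability product works.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    # Hypothetical log/trace record; field names are illustrative only.
    request_id: str
    status: int
    latency_ms: float


@dataclass
class Rollup:
    total: int = 0
    errors: int = 0
    samples: list = field(default_factory=list)  # raw events kept for debugging

    @property
    def error_rate(self) -> float:
        # The kind of number an SLO alert would be evaluated against.
        return self.errors / self.total if self.total else 0.0


def ingest(rollup: Rollup, event: Event, keep_every: int = 100) -> None:
    """Update the counters and keep a sample of the raw data."""
    rollup.total += 1
    is_error = event.status >= 500
    if is_error:
        rollup.errors += 1
    # Assumption: keep every error plus every Nth event overall,
    # so an alert always has some raw events to point at.
    if is_error or rollup.total % keep_every == 0:
        rollup.samples.append(event)
```

The counters stay tiny no matter how much traffic flows through; only the sampled events cost real storage.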

But for auditing a particular transaction, do you really need full indexing of the events? You need a transactional journal for every account/user, likely with a well-defined schema to describe successful changes and failed attempts. Perhaps these come from the same stream of data as the observability tooling, but I can only imagine it's a much smaller subset of the 100PB, small enough that you can avoid building full inverted indexes on it, because your search pattern is simply answering "what happened to this transaction?"
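A minimal sketch of what I mean by a journal, assuming an in-memory store and a made-up schema (`JournalEntry`, its fields, and the `Journal` class are all hypothetical). The point is that the only access path is a key lookup on (account, transaction), not full-text search:

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class JournalEntry:
    # Hypothetical fixed schema covering both successes and failed attempts.
    account_id: str
    transaction_id: str
    timestamp: float  # seconds since epoch, supplied by the caller
    action: str
    succeeded: bool
    detail: Optional[str] = None


class Journal:
    """Append-only journal keyed by (account_id, transaction_id)."""

    def __init__(self) -> None:
        self._by_txn = defaultdict(list)

    def append(self, entry: JournalEntry) -> None:
        # Entries are only ever appended, never updated or deleted.
        key = (entry.account_id, entry.transaction_id)
        self._by_txn[key].append(entry)

    def history(self, account_id: str, transaction_id: str) -> list:
        # Answers "what happened to this transaction?" with one key lookup.
        return list(self._by_txn.get((account_id, transaction_id), []))
```

A real system would need durable storage and retention rules, but the lookup shape stays the same: one key, one ordered list of attempts.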

2 comments

> You need a transactional journal for every account/user, likely with a well-defined schema to describe successful changes and failed attempts.

Sounds like a row in a database to me.

Dumb question, but is that how structured log systems are implemented?

The reality is that when their service delays something, they owe us tens to hundreds of thousands of dollars. This is the tool they’re using, but if they can’t even get a precise notion of when a specific request arrived at their gateway, they’re in trouble.