by softawre 3521 days ago
And keep track of that state across 20 different instances?

What we do is just log the failure and have a system like New Relic monitor everything, so that it can alert us when we hit a 20% network failure rate.

1 comment

Sure - but then the developer-facing "log" is the New Relic interface, and your instances transmit failure information to it via some API. I suppose you could have one program write a plain-text log file and have another program or service parse it to count how many errors were happening, but you wouldn't do that for any other kind of inter-system communication.
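
To make the "transmit via some API" idea concrete, here is a rough sketch of what the receiving side might look like: each instance sends a structured event per request, and a central aggregator computes the failure rate over a recent window. Everything here is hypothetical (`RequestEvent`, `FailureRateMonitor`, the 100-event window); it is not New Relic's API, just an illustration of the shape of the thing, with the 20% threshold taken from the parent comment.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestEvent:
    """Hypothetical structured record an instance sends instead of a log line."""
    instance_id: str
    ok: bool


class FailureRateMonitor:
    """Hypothetical central aggregator over the most recent events from all instances."""

    def __init__(self, threshold: float = 0.20, window: int = 100):
        self.threshold = threshold
        # A bounded deque drops the oldest event once the window is full.
        self.events = deque(maxlen=window)

    def record(self, event: RequestEvent) -> None:
        self.events.append(event)

    def failure_rate(self) -> float:
        if not self.events:
            return 0.0
        failures = sum(1 for e in self.events if not e.ok)
        return failures / len(self.events)

    def should_alert(self) -> bool:
        return self.failure_rate() >= self.threshold


monitor = FailureRateMonitor(threshold=0.20, window=100)

# Simulate 20 instances reporting 100 requests in total; every 4th one fails.
for n in range(100):
    monitor.record(RequestEvent(instance_id=f"instance-{n % 20}", ok=(n % 4 != 0)))

if monitor.should_alert():
    print(f"ALERT: failure rate {monitor.failure_rate():.0%}")
```

The point is that no single instance has to track state across the other 19; they only emit events, and the threshold logic lives in one place.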