by kccqzy 2256 days ago

Yes, and in my previous experience I found it helpful to have both black box monitoring and white box monitoring.

For black box monitoring, we just set up a prober that runs periodically, sends requests, and checks that the responses are what is expected. Bonus if you place multiple such probers across the globe: that also exercises your load balancing and tests the geographic replication of your services.
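A minimal sketch of such a prober in Python, using only the standard library. The names `probe`, `ProbeResult`, `http_fetch`, and `run_forever`, the 60-second interval, and the "body contains a substring" check are all assumptions for illustration, not what we actually ran:

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class ProbeResult:
    ok: bool
    status: int        # 0 means no HTTP response was received at all
    latency_s: float
    detail: str


def http_fetch(url: str, timeout_s: float = 5.0) -> Tuple[int, str]:
    """Fetch a URL and return (status code, body text)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status and body worth checking.
        return err.code, err.read().decode("utf-8", errors="replace")


def probe(
    url: str,
    expected_status: int = 200,
    expected_substring: str = "",
    fetch: Callable[[str], Tuple[int, str]] = http_fetch,
) -> ProbeResult:
    """Send one request and compare the response with what is expected."""
    start = time.monotonic()
    try:
        status, body = fetch(url)
    except Exception as exc:  # DNS failure, refused connection, timeout, ...
        return ProbeResult(False, 0, time.monotonic() - start, f"request failed: {exc}")
    latency = time.monotonic() - start
    if status != expected_status:
        return ProbeResult(False, status, latency, f"unexpected status {status}")
    if expected_substring not in body:
        return ProbeResult(False, status, latency, "expected content missing")
    return ProbeResult(True, status, latency, "ok")


def run_forever(url: str, interval_s: float = 60.0) -> None:
    """Probe periodically; a real prober would alert instead of printing."""
    while True:
        print(probe(url, expected_substring="ok"))
        time.sleep(interval_s)
```

Passing `fetch` as a parameter is only there so the check logic can be exercised without a network. Running the same script from several regions gives you the geographic coverage mentioned above.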

For white box monitoring, we instrumented the code itself to export events and metrics. For example: application-level data such as the metadata of each request and response, response status, time to generate the response, and internal errors encountered; system-level data such as memory allocation and CPU time for the container; and dependency data such as database query times and the durations and statuses of external requests. We used http://riemann.io/ to collect and process these streams and to set up alerts. I find this paradigm really powerful: the app exports streams of data and they are processed externally. The stream-processing mentality does take some getting used to, though.
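To make the "export events, process them elsewhere" idea concrete, here is a rough Python sketch. It does not use a Riemann client: the in-memory `EVENTS` list stands in for whatever transport ships events out of the process, and `emit`, `instrumented`, `error_rate`, and `get_user` are hypothetical names. The event keys loosely follow the fields a Riemann event carries (host, service, state, metric, description, time):

```python
import socket
import time
from functools import wraps
from typing import Any, Callable, Dict, List, Optional

Event = Dict[str, Any]
EVENTS: List[Event] = []  # stand-in sink; a real app would send these over the network


def emit(event: Event) -> None:
    EVENTS.append(event)


def instrumented(service: str, sink: Callable[[Event], None] = emit):
    """Wrap a handler so every call exports one event with status and duration."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            state, description = "ok", ""
            try:
                return handler(*args, **kwargs)
            except Exception as exc:
                state, description = "error", repr(exc)
                raise  # the caller still sees the original error
            finally:
                sink({
                    "host": socket.gethostname(),
                    "service": service,
                    "state": state,
                    "metric": time.monotonic() - start,  # seconds spent in the handler
                    "description": description,
                    "time": time.time(),
                })
        return wrapper
    return decorator


def error_rate(events: List[Event], service: str, window_s: float,
               now: Optional[float] = None) -> float:
    """External-processing side: fraction of recent events for a service that failed."""
    now = time.time() if now is None else now
    recent = [e for e in events
              if e["service"] == service and now - e["time"] <= window_s]
    if not recent:
        return 0.0
    return sum(e["state"] == "error" for e in recent) / len(recent)


@instrumented("api.get_user")
def get_user(user_id: int) -> dict:
    if user_id < 0:
        raise ValueError("invalid user id")
    return {"id": user_id}
```

The app side only knows how to describe what happened. Deciding what counts as alert-worthy (here, something like `error_rate(EVENTS, "api.get_user", 60) > 0.1`) lives in the stream processor, so thresholds can change without redeploying the app.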