by natdempk
2706 days ago
jerf answered observability well in another reply to this comment. As for reliability, monitoring, and error handling, I've heard good things about the Google SRE book: https://landing.google.com/sre/books/ I haven't read it myself, but from a brief look the advice there lines up with what I've experienced in practice.