Hacker News
by efazati 1707 days ago
I think email is not really the best way to manage errors. Maybe something like https://sentry.io/ would work better.
2 comments

This is one dude's side project, not an enterprise K8s deployment - I think email is a great option. Of course when you go full "web scale" you should start using products like Sentry, but we're talking about alternatives to "dump an error to stdout" here.

Sentry is very easy to self-host[1] on a single node with Docker Compose. They make it a little tricky to self-host with multiple nodes, but I assume their SaaS product can scale indefinitely.

I can't think of anything easier for error tracking than Sentry, given its ability to automatically intercept exceptions in languages like Python. Sentry also has some automatic handling for stack traces, recording the state of Redis clusters and similar bits of infra, and redacting information that appears to be sensitive (e.g. database passwords).

[1] https://github.com/getsentry/onpremise

Developers can carry the practice into a company if they aren't careful.

At my last job, some of the senior/architect developers had baked in code that would email them with various application statuses. They pointed the logs at their main email addresses for maximum visibility. But what ended up happening is that they let their inboxes get so flooded with email that they simply never bothered to check their email again, including any work-related messages. So to solve that problem, a couple of them just set up keyword filters to auto-forward inbox messages to yet another service that they would check on.

Even now, there's some legacy system that emails all of the developers with some error messages. I think only two people on our 20+ person development team even know what the emails are for.

Further, if you flood your email server, you can miss logs. And if you hand your project off to someone else, you'd have to figure out if you also want to hand over your email account, or if you want to point the logs to their email account.

tl;dr: email is the wrong solution for logging

Thanks for your comment and your experience. I agree that at a large scale it would be silly to receive individual emails for error messages. It would make more sense to have a dashboard with an aggregated view and statistics and everything. Piecing together a story or determining long-term performance by email would be no good.

I would also dread the idea of multiple people logging into a single email account and triaging things without knowing who read what, or everyone getting their own copy of everything and not knowing what needs doing.

But to know that my monthly backups are working or having trouble, this is working well for me so far!

> Thanks for your comment and your experience. I agree that at a large scale it would be silly to receive individual emails for error messages.

At small to medium scale, having a mailing list for the dev team which gets emailed when issues come up can be quite handy. It can't be your whole process - someone still needs to take responsibility for actually fixing problems. And you might need to aggressively rate-limit it when errors happen. But for the occasional email it can work quite well. It's much easier than building a dashboard.

E.g. "[ops] Monthly backup process FAILED", "[ops] Warning: prod4 at 95% RAM usage"
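
For what it's worth, the "aggressively rate limit" part can be sketched with nothing but Python's standard `logging` module. This is only an illustration, not anyone's production setup: `RateLimitFilter`, `make_ops_logger`, the logger name "ops", and the default budget of 5 messages per hour are all made up for the example.

```python
import logging
import time


class RateLimitFilter(logging.Filter):
    """Let through at most `max_records` records per `window` seconds."""

    def __init__(self, max_records=5, window=3600.0, clock=time.monotonic):
        super().__init__()
        self.max_records = max_records
        self.window = window
        self.clock = clock
        self._sent = []  # timestamps of records that were let through

    def filter(self, record):
        now = self.clock()
        # Forget timestamps that have fallen out of the window.
        self._sent = [t for t in self._sent if now - t < self.window]
        if len(self._sent) >= self.max_records:
            return False  # over budget: drop this record
        self._sent.append(now)
        return True


def make_ops_logger(handler, max_records=5, window=3600.0):
    """Attach a rate-limited handler to a logger named "ops" (hypothetical name)."""
    handler.addFilter(RateLimitFilter(max_records, window))
    handler.setFormatter(logging.Formatter("[ops] %(message)s"))
    logger = logging.getLogger("ops")
    logger.addHandler(handler)
    logger.setLevel(logging.WARNING)
    return logger
```

To actually send mail, the handler passed in would presumably be a `logging.handlers.SMTPHandler` configured with your own mail host and list address. Note that this sketch silently drops anything over the budget, so it fits "tell me something is wrong" alerts better than a complete record of errors.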

You could just log to a file and then use logrotate, which should already be available on your system, to help manage the old logs. Hopefully your application doesn't get so big that you need to rethink your logging solution.
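
A rough sketch of the log-to-a-file side in Python, assuming a Unix-like system and that logrotate is rotating by renaming the file. `make_file_logger` and the logger name "myapp" are hypothetical; the handler itself is the standard library's `WatchedFileHandler`, which reopens the log file when it notices the file has been moved or deleted.

```python
import logging
import logging.handlers


def make_file_logger(path, name="myapp"):
    """Log to `path`, reopening the file if logrotate moves or removes it."""
    handler = logging.handlers.WatchedFileHandler(path, encoding="utf-8")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

The point of using `WatchedFileHandler` instead of a plain `FileHandler` is that the program doesn't need to be restarted or signalled after a rotation; the next log call just starts a fresh file at the same path.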

I had to turn Outlook rules off temporarily due to an issue. I was getting over 1500 notification emails a day that I usually just auto-delete. They just email entire departments and have us set up rules for things we don't care about, rather than manage proper lists so we can subscribe to things we do care about.

I was thinking the same thing! Any logging tool will take the errors and group them, or do some kind of housekeeping, which beats 200 emails slamming an inbox.

In my "my_operatornotify" file, which isn't published, I did test a batching mechanism that used a background thread to wait 60 seconds before sending the email, in case more messages with the same subject line came along. It worked, but I decided to stop using it. Most of my programs send their complete logs after they finish; only the long-running daemon programs send individual warning emails right away. I got a laugh out of the YCDL explosion and fixed the issue, for now anyway.

Thanks for the comments!
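
For anyone curious what a batching mechanism like the one described above might look like, here is a minimal sketch. It is not the unpublished "my_operatornotify" code, just one way to do it: `BatchingNotifier` and its `send(subject, body)` callback are hypothetical names, and `threading.Timer` stands in for the background thread. Messages with the same subject line that arrive within the delay are joined into a single email.

```python
import threading
from collections import defaultdict


class BatchingNotifier:
    """Collect messages per subject and send each group once after `delay` seconds."""

    def __init__(self, send, delay=60.0):
        self._send = send            # callable(subject, body); hypothetical hook
        self._delay = delay
        self._lock = threading.Lock()
        self._pending = defaultdict(list)  # subject -> list of bodies
        self._timers = {}                  # subject -> running Timer

    def notify(self, subject, body):
        with self._lock:
            self._pending[subject].append(body)
            if subject not in self._timers:
                # First message for this subject: start the wait.
                timer = threading.Timer(self._delay, self._flush, args=(subject,))
                timer.daemon = True
                self._timers[subject] = timer
                timer.start()

    def _flush(self, subject):
        with self._lock:
            bodies = self._pending.pop(subject, [])
            self._timers.pop(subject, None)
        if bodies:
            self._send(subject, "\n\n".join(bodies))

    def flush_all(self):
        """Send everything now, e.g. right before the program exits."""
        with self._lock:
            subjects = list(self._pending)
            for timer in self._timers.values():
                timer.cancel()
        for subject in subjects:
            self._flush(subject)
```

One catch with this design: the timers are daemon threads, so a short-lived program that exits before the delay is up loses its pending messages unless it calls `flush_all()` first. That may be part of why sending the complete log at the end is simpler for programs that finish on their own.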