by backslash_16 2082 days ago
Really easy and good homegrown logging: use a structured logging library. This is key because it lets you easily filter on the properties attached to a log event.
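To make "properties attached to a log event" concrete, here is a minimal sketch using only Python's standard logging module rather than any particular structured logging library. The `props` key and the `make_logger` helper are names made up for this illustration.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object so its properties stay filterable."""

    def format(self, record: logging.LogRecord) -> str:
        event = {
            "level": record.levelname,
            "message": record.getMessage(),
            # "props" is a hypothetical attribute name carrying the structured
            # fields; it is populated through the `extra` argument below.
            **getattr(record, "props", {}),
        }
        return json.dumps(event, sort_keys=True)


def make_logger(name: str, stream) -> logging.Logger:
    """Build a logger that writes one JSON event per line to `stream`."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    import sys

    log = make_logger("orders", sys.stdout)
    log.info("order placed", extra={"props": {"request_id": "abc-123", "user_id": 42}})
```

Because every event is a flat set of key/value pairs, whatever stores it can filter on `request_id` or `user_id` directly instead of parsing free text.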

Then log those events to a database. If you don't have a ton of volume and want to use really reliable tech and be safe(ish), send your logs to a relational DB used only for logging. I have never tried sending them to just a separate table within the production database, but I know that can be done too.

Create read-only users on that logging DB or table.

Download a GUI client for that DB and you're up and running. You can use SQL to filter, order by timestamp, and create views.
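As a rough sketch of the "filter, order by timestamp, and create views" part, the snippet below uses SQLite from Python's standard library as a stand-in for the logging DB. The table name `log_events`, its columns, and the `errors` view are assumptions for illustration. SQLite has no user accounts, so the read-only users mentioned above would be set up in whatever server DB you actually pick.

```python
import sqlite3


def open_log_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create the (hypothetical) log table and an errors-only view."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS log_events (
            id         INTEGER PRIMARY KEY,
            created_at TEXT NOT NULL,  -- UTC ISO 8601 in one fixed format
            level      TEXT NOT NULL,
            request_id TEXT,
            message    TEXT NOT NULL
        )
        """
    )
    conn.execute(
        """
        CREATE VIEW IF NOT EXISTS errors AS
        SELECT created_at, request_id, message
        FROM log_events
        WHERE level = 'ERROR'
        """
    )
    return conn


def write_event(conn, created_at, level, request_id, message):
    """Insert one log event using bound parameters."""
    conn.execute(
        "INSERT INTO log_events (created_at, level, request_id, message) "
        "VALUES (?, ?, ?, ?)",
        (created_at, level, request_id, message),
    )
    conn.commit()


if __name__ == "__main__":
    conn = open_log_db()
    write_event(conn, "2024-01-01T10:00:02Z", "ERROR", "r1", "payment failed")
    write_event(conn, "2024-01-01T10:00:01Z", "INFO", "r1", "payment started")
    # The same query could be typed into any GUI client for the DB.
    for row in conn.execute("SELECT * FROM errors ORDER BY created_at DESC"):
        print(row)
```

The explicit ORDER BY matters here: rows can arrive out of order, and SQL does not promise any row order without one.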

If you still want to stay homegrown, it's straightforward to put an API in front of it that powers some dashboards or easy investigation tools. For example, submit a correlation/request ID and get back every log event for it in order, color the error logs red, and boom: first-level data visualization for investigations is done too.
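The request ID lookup could look something like the sketch below, again with SQLite standing in for the logging DB. The `log_events` table, the demo rows, and the function names are invented for this example, and the "red" is just an ANSI escape code for a terminal. A real tool would sit behind an HTTP API and a dashboard.

```python
import sqlite3

RED = "\x1b[31m"   # ANSI escape: red foreground
RESET = "\x1b[0m"  # ANSI escape: reset attributes


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory table with a few made-up events."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE log_events (id INTEGER PRIMARY KEY, created_at TEXT, "
        "level TEXT, request_id TEXT, message TEXT)"
    )
    conn.executemany(
        "INSERT INTO log_events (created_at, level, request_id, message) "
        "VALUES (?, ?, ?, ?)",
        [
            ("2024-01-01T10:00:00Z", "INFO", "req-1", "request received"),
            ("2024-01-01T10:00:02Z", "ERROR", "req-1", "payment failed"),
            ("2024-01-01T10:00:01Z", "INFO", "req-2", "request received"),
            ("2024-01-01T10:00:01Z", "INFO", "req-1", "charging card"),
        ],
    )
    conn.commit()
    return conn


def events_for_request(conn, request_id):
    """Return every event for one request, oldest first."""
    return conn.execute(
        "SELECT created_at, level, message FROM log_events "
        "WHERE request_id = ? ORDER BY created_at, id",
        (request_id,),
    ).fetchall()


def render(events) -> str:
    """Format events one per line, wrapping ERROR lines in red."""
    lines = []
    for created_at, level, message in events:
        line = f"{created_at} {level:<5} {message}"
        if level == "ERROR":
            line = f"{RED}{line}{RESET}"
        lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(events_for_request(make_demo_db(), "req-1")))
```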


Yeah, at moderate volumes (or low depending on your perspective) there's no problem putting it straight into a table in the DB.

If it ever causes a problem, you can move it.

You shouldn't need to order by timestamp; it should already be ordered by the very nature of logs.

I completely agree. Because I think this area is really interesting I'm going to add a few more thoughts.

The first web service I built used a table inside the production database for structured logging. I had one kind of log event that I just kept adding more and more fields to over time, leaving fields empty when they didn't apply, which ends up being an anti-pattern like a god class. That wasn't terrible: I could filter for the logs I wanted by including a condition like RequestId IS NULL.

The biggest problem I have seen with using a table in the DB rather than a completely separate DB is this: in a traditional relational DB (I'm sort of making up that term to refer to systems like MS SQL Server, PostgreSQL, MySQL) the schema is static, and each table can only contain one type of log or event unless you stuff it all into a dynamic-ish column type like JSON or XML (ewww, using XML as a logging format).

If you have multiple log formats or types (in the type-system sense of types), you need to use multiple tables to keep the filtering easy. An unintended bonus is that joining across tables in a relational DB is far faster than joining across event types in the other logging systems I have worked with.

I guess I'm trying to say you should consider using a separate DB once you have more than a few log types (which are different tables in the prod DB), which still 100% backs up what you said.

Start off in the prod DB and move it if and when you need to. In the worst-case scenario you can do unstructured logging with like 4 common fields (LogCreationTime, LogIngestionTime, RequestID, HostName) and a message field, and use that until it doesn't work anymore. By then you have a service that is complex enough to warrant spending more time on the logging.
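For what it's worth, that worst-case table is tiny. Below is a sketch with SQLite standing in for the prod DB. The four field names come from the list above, while the table name `Logs`, the `Message` column name, and the `ingest` helper are assumptions for illustration.

```python
import socket
import sqlite3
from datetime import datetime, timezone


def utc_now() -> str:
    """Current time as a UTC ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


def create_table(conn: sqlite3.Connection) -> None:
    """Create the four-common-fields-plus-message table."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS Logs (
            LogCreationTime  TEXT NOT NULL,  -- when the service emitted the event
            LogIngestionTime TEXT NOT NULL,  -- when the row was written
            RequestID        TEXT,           -- NULL for events outside a request
            HostName         TEXT NOT NULL,
            Message          TEXT NOT NULL
        )
        """
    )


def ingest(conn, message, request_id=None, creation_time=None, host_name=None):
    """Write one event; ingestion time is always stamped at write time."""
    conn.execute(
        "INSERT INTO Logs VALUES (?, ?, ?, ?, ?)",
        (
            creation_time or utc_now(),
            utc_now(),
            request_id,
            host_name or socket.gethostname(),
            message,
        ),
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_table(conn)
    ingest(conn, "cache warmed up")
    ingest(conn, "payment failed", request_id="req-1")
    for row in conn.execute("SELECT * FROM Logs ORDER BY LogCreationTime"):
        print(row)
```

Keeping creation and ingestion time separate lets you notice events that arrived late.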