by mattmanser 2082 days ago
Yeah, at moderate volumes (or low depending on your perspective) there's no problem putting it straight into a table in the DB.

If it ever causes a problem, you can move it.

You shouldn't need to order by timestamp; it should already be ordered by the very nature of logs.

1 comments

I completely agree. Because I think this area is really interesting I'm going to add a few more thoughts.

The first web service I built used a table inside the production database for structured logging. I had one kind of log event that I kept adding more and more fields to over time, leaving fields empty when they didn't apply, which ends up being an anti-pattern like a god class. That wasn't terrible, though: I could filter for the logs I wanted by including a condition like RequestId IS NULL.
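To make that concrete, here's a rough sketch of the "one wide table with empty fields" setup, using Python's built-in sqlite3 just so it runs anywhere. The table name, the JobName column, and the sample rows are all made up for illustration; only the RequestId IS NULL filter comes from what I actually did.

```python
import sqlite3

# In-memory SQLite stands in for the production DB (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Log (
        Id INTEGER PRIMARY KEY,
        LogCreationTime TEXT NOT NULL,
        Message TEXT NOT NULL,
        RequestId TEXT,  -- only filled in for request events
        JobName TEXT     -- hypothetical column, only filled in for background jobs
    )
    """
)

# Every event type shares the same row shape; unused fields are left NULL.
rows = [
    ("2024-01-01T00:00:00Z", "GET /users", "req-1", None),
    ("2024-01-01T00:00:01Z", "nightly cleanup started", None, "cleanup"),
    ("2024-01-01T00:00:02Z", "POST /orders", "req-2", None),
]
conn.executemany(
    "INSERT INTO Log (LogCreationTime, Message, RequestId, JobName) "
    "VALUES (?, ?, ?, ?)",
    rows,
)

# Pick out the events that are not tied to a request.
non_request = conn.execute(
    "SELECT Message FROM Log WHERE RequestId IS NULL ORDER BY Id"
).fetchall()
print(non_request)
```

It works, but every new event type means another nullable column and another IS NULL / IS NOT NULL condition to remember.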

The biggest problem I have seen with using a table in the prod DB rather than a completely separate DB is that in a traditional relational DB (I'm sort of making up that term to refer to systems like MS SQL Server, PostgreSQL, MySQL) the schema is static, so each table can only contain one type of log or event unless you stuff it all into a dynamic-ish column type like JSON or XML (ewww, using XML as a logging format).

If you have multiple log formats or types (in the type-system sense of types), you need to use multiple tables to keep the filtering easy. An unintended bonus is that joining across tables in a relational DB is far faster than joining across event types in the other logging systems I have worked with.
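A minimal sketch of the one-table-per-log-type idea, again with sqlite3 so it's self-contained. RequestLog, ErrorLog, and their columns are hypothetical names I'm picking for the example, not a recommended schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table per log type, each with its own fixed schema (names are hypothetical).
conn.executescript(
    """
    CREATE TABLE RequestLog (
        RequestId TEXT PRIMARY KEY,
        LogCreationTime TEXT NOT NULL,
        Path TEXT NOT NULL
    );
    CREATE TABLE ErrorLog (
        Id INTEGER PRIMARY KEY,
        RequestId TEXT NOT NULL,
        LogCreationTime TEXT NOT NULL,
        ErrorMessage TEXT NOT NULL
    );
    """
)

conn.executemany(
    "INSERT INTO RequestLog VALUES (?, ?, ?)",
    [
        ("req-1", "2024-01-01T00:00:00Z", "/users"),
        ("req-2", "2024-01-01T00:00:02Z", "/orders"),
    ],
)
conn.execute(
    "INSERT INTO ErrorLog (RequestId, LogCreationTime, ErrorMessage) "
    "VALUES (?, ?, ?)",
    ("req-2", "2024-01-01T00:00:03Z", "payment gateway timeout"),
)

# Joining across log types is an ordinary SQL join on the shared RequestId.
failed = conn.execute(
    """
    SELECT r.Path, e.ErrorMessage
    FROM ErrorLog AS e
    JOIN RequestLog AS r ON r.RequestId = e.RequestId
    ORDER BY e.Id
    """
).fetchall()
print(failed)
```

Filtering by type is just choosing which table to query, and correlating types is a plain join.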

I guess I'm trying to say you should consider using a separate DB once you have more than a few log types (each of which is a separate table in the prod DB), which still 100% backs up what you said.

Start off in the prod DB and move it if and when you need to. In the worst case you can do unstructured logging with like 4 common fields (LogCreationTime, LogIngestionTime, RequestID, HostName) plus a message field, and use that until it doesn't work anymore. By then you have a service that is complex enough to warrant spending more time on the logging.
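Something like the following is what I have in mind for that worst-case setup. It's only a sketch: the four common fields and the message field are the ones listed above, while the Id column, the write_log helper, and the use of SQLite are my own assumptions for the example.

```python
import socket
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Log (
        Id INTEGER PRIMARY KEY AUTOINCREMENT,  -- assumed surrogate key
        LogCreationTime TEXT NOT NULL,
        LogIngestionTime TEXT NOT NULL,
        RequestID TEXT,
        HostName TEXT NOT NULL,
        Message TEXT NOT NULL
    )
    """
)


def write_log(conn, message, request_id=None, created_at=None):
    """Hypothetical helper: insert one unstructured log line."""
    now = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO Log "
        "(LogCreationTime, LogIngestionTime, RequestID, HostName, Message) "
        "VALUES (?, ?, ?, ?, ?)",
        # Creation time defaults to ingestion time when the caller has none.
        (created_at or now, now, request_id, socket.gethostname(), message),
    )
    conn.commit()


write_log(conn, "user signed in", request_id="req-1")
write_log(conn, "cache warmed")

# Reading back by the auto-incrementing Id gives insertion order,
# so there is no need to sort on a timestamp column.
messages = [row[0] for row in conn.execute("SELECT Message FROM Log ORDER BY Id")]
print(messages)
```

Keeping creation and ingestion time separate costs one extra column now and makes it much easier to spot delayed or batched writes later.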