Hacker News new | ask | show | jobs
by derefr 4281 days ago
Thus why your logging library should have a configurable log-level, and the first thing the user-facing log() function should do is check the arguments against the log-level and early-return if the message isn't important enough.

(In languages that support macros, you could also just make debug_log() a macro that gets compiled out on release builds, like assert(). That means a whole bunch of object files are going to be functionally different between debug and release builds, though, which is annoying for a lot of operational reasons.)

My favourite solution by far, though, is the one recently implemented by Elixir: log() is a regular function, but takes a closure instead of a string. That way, you can actually put all the work required to generate the log message (inspecting data structures, doing intermediate lookups, etc.) inside the closure, and it all gets elided when you decrease the log-level.

7 comments

Macros let you do option three, too, by just guarding the body with an if in the generated code. That way you avoid having to use an anonymous function each time. So in Lisp (log level (construct-a-string)) instead of (log level (fn [] (construct-a-string))), but with the same elisions logic.
At least in Java, log levels don't buy you much when you wind up possibly rendering object graphs to strings (via your ohsohelpful custom, recursive tostring implementation), then duping it to concat "made it here" to it, then immediately abandon the mess on the heap for the gc to handle, since the production log level is warn, so your carefully built, highly detailed log message is never used. Repeat every other line or so... In tight loops, call iter.next in log message and on and on...
That's not a java problem, that's an "I don't understand Java logging" problem. The right way:

    log.debug("I created this gigantic object {}",
              myBigUglyObject)
"possibly rendering object graphs to strings"

But that's a problem with how you choose to use logging in your application - not with logging in general or specific logging features in Java.

Logging used without any care can cause lots of problems - but then so can any part of an application. Used with some care it can be a very useful tool for helping to diagnose all kinds of problems.

Some java servers I have had the misfortune of diagnosing were found to be spending the majority of their CPU time (aside from garbage collection of course) formatting exception stack traces.
which is why slf4j has format strings
... or your recursive reflective tostring implementation....
> My favourite solution by far, though, is the one recently implemented by Elixir: log() is a regular function, but takes a closure instead of a string.

This is a pretty fantastic route as far as syntactical simplicity goes.

I wonder if it can have an implications on the generated assembly nonetheless? Could there be extra assembly generated outside the branch for logging level check, which is necessary to bind the variables for the closure?

Similarly, I might suspect that an optimizing compiler that is considering a function for inlining might change its mind for any function where it sees a nontrivial closure like that, without knowing that "trace" level logging nearly never happens.

"Edge cases", to be sure, and the answers are surely different between languages, but interesting to consider nonetheless.

You may know this, but maybe it's worth pointing out that the big savings here is actually just that the function itself can be pretty expensive, since it may stringify some data structure. In eagerly evaluated languages, if you say this:

    log("debug", "I found: " + foo.toString());
you end up doing a lot of work to create that string, only to have it be ignored by the log call. Whereas if you say:

    log("debug", => "I found: " + foo.toString());
any extra assembly you execute (or generate) to make that work is peanuts compared to avoiding the call to toString(). None of that (besides the simple thing of log levels) helps with the OP's problem with IO, but it can make a real difference to your CPU.
> My favourite solution by far, though, is the one recently implemented by Elixir: log() is a regular function, but takes a closure instead of a string.

A pretty common halfway solution (used by Python's logging for instance) is that the logging library takes a format string and its parameters, and does the formatting itself.

Doesn't save intermediate lookups or complex computations if they exist, but for the common case of a complex stringification it can save your bacon.

The alternative is an explicit conditional around the logging call, which is definitely worse than the ability to log a closure.

Monolog does that, you call the level that you want so you can early-return based on what environment/config you've got set up. Quite nice, means I can keep my debug logs around and deploy the same code to production without worrying about it. I think the PSR standard for logging actually specifies this, too.
Enclosed log evaluation is really nice and I'd have some trouble going back. It's older than Elixir, too; Commons.Logging in C# has supported it for a long time and most logging frameworks in Scala take a call-by-name parameter rather than an evaluated arg.
Example in Scala from Play:

https://github.com/playframework/playframework/blob/2.3.x/fr...

The '=>' means call-by-name in Scala, the argument is lazily evaluated.

Clever. I like that (using the closure).

I've never used Elixir but toyed with Erlang many years ago - is it called asynchronously by the logging process in Elixir? If so, I guess you're relying on immutable data structures to make sure you get the correct logging output?

Everything goes through an async message send to a gen_event (error_logger). The gen_event itself can have annoyingly synchronous+serial behavior with respect to its backends, but gen_event handlers aren't supposed to do much heavy-lifting; they mostly just decide whether they want a message, and then forward it to some backend process to do the formatting+writing+etc.

This gives you another neat benefit, nearly for free: if you have, say, application Foo on node A, wanting to log to SyslogSink on node B, then Foo first sends its logs to a gen_event on A, and the SyslogSink handler in that gen_event gets the opportunity to drop messages, using whatever logic SyslogSink likes, before forwarding them over to B (which might be half-way across the Internet.)