Hacker News
by turtledragonfly 1202 days ago
I use GDB almost daily, and used Visual Studio pretty deeply for many years (and still a little bit, nowadays), but I must say I am still a "printf debugging" aficionado (or better: real logging).

I like many of the features that debuggers can provide, and I think this is a good article to set aspirational goals for what is possible. But my lived experience has generally been that it's a buggy, fuzzy, moving target in terms of overall user experience. GDB, bless its heart, is still somewhat a pile-of-bugs itself. I frequently have it crash, or otherwise get into a "so confused it needs to be restarted" state. Visual Studio is much more stable, though less powerful, too — less scriptable, at least.

Perhaps some day, debugger tech^H^H^H^H UX will advance to the point where it really delivers on its promises consistently and solidly. But after 20+ years in software, I am not holding my breath (: There are some situations where a debugger is just the thing you want (eg: hardware breakpoints can be a life-saver), but I find that's more the exception than the rule, at least in the corners of the software world I've worked in.

Compare the above with logging: It is simple and trustworthy. As you get good at designing a solid logging system, and interpreting the results, your life just gets better and better. If you get good at using a debugger, you can still be hit with gnarly weird behaviors, debugger apoplexy, optimized-out-code wackiness, etc. that are hard to control or predict.

Anyway, "debugger vs. logging" are often presented as some sort of either/or choice, and in some sense it is (you only have X time to spend; where would you like to spend it?), but in many senses it is not; both have their strengths. I just find that the cost/benefit for me has generally favored logging and testing, over the years.

9 comments

I think one problem I have with gdb and to a lesser extent lldb is that when using them with an IDE, it's just a janky connection between a command-line program and a GUI program. Back in my Mac OS 7-9 days, using something like CodeWarrior, or even Visual C++ on Windows, stepping was so fast and smooth. It would respond instantaneously. I can now press the step button like 5 times and watch as it steps to each line. My machine is a bazillion times more powerful, so I have to wonder if that interface between the UI and the debugger is part of the problem. I realize there's also protected memory and separate processes, and all that. But man it's insane how much faster our machines are and how much worse just single stepping is.

You might want to look into the "TUI" mode in GDB. It's an ncurses-style interface, where it shows you the current code, and current line, and you can step along "visually." It is fast. Press Ctrl-L to re-draw the screen when the display gets messed up.

...

Then, you may notice that TUI mode steals certain keyboard commands, such as up/down arrow to scroll the source listing, rather than navigating command history.

Then, you might enable Vi-mode for GDB's readline, so you can use "j/k" to navigate command history, even in TUI mode. Plus Vi-mode is just better (:

Then, you may find that certain things don't work quite right in Vi-mode, because it's not the default and doesn't get as much testing. But you fuddle along because it's better than the alternative.

And thus you have arrived at my basic situation (:

I shouldn't have to sacrifice either good UX or speed of single stepping on a modern machine. It says a lot that that's even a real suggestion in our industry.

How long before a Lua implementation of gdb for neovim outperforms gdb in Vi-mode?

I learned C++ on CodeWarrior (and then vi) and its debugger made me require a debugger whenever I write code. I am currently using PHPStorm (or any JetBrains) and its XDebug debugger.

For Javascript, I use Brave/Chrome debugger.

...And I should use logs more.

I am always in shock when I see people work on large projects with no debugger. I do printf/console.log all the time but HOW DO PEOPLE NOT STEP-STEP-STEP??

> HOW DO PEOPLE NOT STEP-STEP-STEP

I think I have felt some of what you are feeling. The debugger is ... seductive (: For one thing, it is an educational experience, as you step along you are learning what your program is actually doing, which is a good mental checkpoint, especially when I was a more junior developer. I think a debugger as an educational tool is a big point in its favor. And so you may develop a positive relationship with your debugger, and want to bring it with you wherever you go.

But you may find, as I did, that the sparkle of that relationship fades somewhat in time. You will be better at knowing what the code does, without needing the debugger to tell you. You will hit situations where the debugger cannot help you, or is less helpful than the alternatives. You will get better at structuring your code so certain classes of problems simply don't happen as much, rather than using a debugger to peek-and-poke to fix things all the time. You may find the debugger to be a relatively exhausting high-touch real-time experience, compared to thinking about issues more at your leisure, or getting a larger understanding from other sources.

And so, with experience, and hopefully with an open mind, the debugger will settle into its proper station, in the pantheon of your various debugging aids. Not bad, but not the end-all-be-all either.

I am being somewhat flowery in my language, but all I am saying is: it's okay to rely on the debugger, at some stages. Give it time. Just don't forget about the alternatives or denigrate them needlessly. You want to build a big happy family of technologies and techniques (many of them residing solely in your mind) that can all work together.

“HOW DO PEOPLE NOT STEP-STEP-STEP”.

For me it’s because that is slow, tedious and exhausting. Often you need to cover a lot of code surface area to find a bug.

Worse, the bug may be something reported in production in a giant pile of code.

A good set of logs will go a very long way to at least narrowing down where the bad behavior is coming from, and is generally scalable.

The debugger is useful, but generally only at the extremes of “new to the code, totally clueless” and “really advanced bug that I need to bring out the big guns for”.

The logging approach clicks in really fast the minute you’re talking about a multiprocess program, be it microservices or just plain old SOA.

This reminds me of the time a co-worker was explaining to me how awesome the logging system he wrote was. I expressed my skepticism about using a complex one-off system for debugging purposes, on the basis that said system itself needs debugging. Then, no more than a couple of months after that conversation, a test breaks. The test switches between two code paths on some condition and emits a different message for each via said logging system; both messages should appear, but now there is just one. This implies that either the condition is not detected or it just does not happen. Perhaps a system issue? Could the new hardware revision we switched to recently be faulty? Nope, the older build of the same test runs fine.

Git bisect to the rescue, what do you know, it breaks on the commit adding another improvement to the logging system. The second message is getting eaten by the logging system, that's all.

I don't think this persuaded the co-worker to stick to the more common tools, but it definitely fortified me in my belief that a buggy tool you made yourself is in no way better than a buggy tool that thousands use all the time.

I have been programming professionally since 1986 and still nothing beats logging or having chunks of specialized code to do dumps of some data to files so you can analyze them with tools better suited for the purpose. Ideally, though, you don't want to have to modify the code to diagnose the problem, especially if it's a crash caught in the wild and you have a chance to live debug it. I would love more useful visualization tools in the debuggers (mostly VS for me), which would be very helpful in situations like debugging crash dumps.

Most of the data I work with can't be visualized by printing. (I mostly work on 3D and video.) But I have found it invaluable to log data, then read it into another program. Sometimes I can just bring it into a spreadsheet and plot it, other times I need to write something to display it or analyze it. It's definitely an under-utilized technique!
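As a rough illustration of the "log data, then read it into another program" approach described above, here is a minimal Python sketch. The function names (`dump_samples`, `load_and_summarize`) and the two-column frame/value layout are assumptions made up for the example, and an in-memory buffer stands in for a real file.

```python
import csv
import io
import statistics


def dump_samples(stream, samples):
    """Write (frame, value) pairs as CSV so a spreadsheet or script can read them."""
    writer = csv.writer(stream)
    writer.writerow(["frame", "value"])
    writer.writerows(samples)


def load_and_summarize(stream):
    """Read the CSV back in a separate step and compute simple statistics."""
    rows = list(csv.DictReader(stream))
    values = [float(row["value"]) for row in rows]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "max": max(values),
    }


# io.StringIO stands in for a file on disk (hypothetical data).
buf = io.StringIO()
dump_samples(buf, [(0, 1.0), (1, 3.0), (2, 2.0)])
buf.seek(0)
summary = load_and_summarize(buf)
print(summary)
```

The same CSV could just as well be opened in a spreadsheet and plotted; the point is only that the dump format is something other tools can consume.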
> It is simple and trustworthy.

This sounds so naive for someone with 20+ years in the field...

Linux, for decades, couldn't get logging to the point that it at least doesn't lose messages (the problem with tail / logrotate that is quite obvious once you think about it, but it took many years to give up the approach).

I recently hit a bug where NVidia's driver abuses Linux kernel logging in some tight loop by spamming log messages at insane speed (happens when you have two video adapters, Intel and NVidia and an external monitor). An interesting side-effect here is that Linux logging tries to throttle loggers who output too much, so, from the log you cannot tell what's happening (because even though the system is burning calories trying to print a tonne of messages, nothing really gets printed).

Several iterations ago I worked on a product where logging had to be implemented as writes to a home-grown circular buffer in shared memory, and because too much info was printed too quickly you only had a few seconds' worth of logs before a system crash... on a good day.

Needless to mention the fun of stitching together logs coming from different places in your system with separate clocks.

Even simply processing hundreds of Gigabytes of logs on its own isn't a trivial task.

----

Many things are simple, when your task is simple. Logging is just one of those things.

> Many things are simple, when your task is simple. Logging is just one of those things.

I agree with much of what you said, and of course "logging" is not just a single point in the solution space — there is some function "troubleshooting_pain = f(your_project, your_approach)". I was trying to say that for "your_approach=logging" that function tends to return smaller values than for "your_approach=debugging", all other things being equal, in my experience.

Whereas your comments seem more oriented towards the "your_project" factor. Of course using logs is harder on a distributed system. But so is using a debugger, or just about anything else.

Perhaps I should have said "It is relatively simple and trustworthy, even if it can still get hairy at the extremes."

Both interactive and declarative debuggers work better in distributed systems than logging because they can observe events as they happen, and don't need to recreate the order in which they happened from the records which are very hard to make chronologically consistent.

Things like eBPF (which may implement a sort of declarative debugger) are perhaps the only tools you may hope to use in high-volume, high-frequency systems.

If I could only choose one technology used for software diagnostics, I'd choose debuggers over logging. Debuggers need more effort to develop them, and they aren't very good (yet), but they have potential. I don't believe that logging can be substantially improved to deal with difficult problems.

One thing that can be pretty nice which is kinda neither traditional debugging nor logging is DTrace (or similar). Basically event tracing on steroids. Maybe eBPF is in that vein? I don't have much personal experience with it, but I have heard some stories of good success on busy production systems.

I guess my (limited) experiences with distributed systems are different than yours. The notion of "pausing" the system to step through things interactively was usually untenable. Do you stop the one node that shows the issue, and let the others run, getting into who-knows-what shared state? Do you somehow attempt to stop them all, and hope/pray that they all are in the right state to make your cross-node analysis meaningful? This was mostly on Apache Spark, where parallelism was the name of the game. Maybe for some kind of long-running distributed system like Erlang it's a different story.

> Maybe eBPF is in that vein?

Not just that :) It was "inspired by". Well, it's the same idea.

> The notion of "pausing" the system to step through things interactively was usually untenable.

That's not what eBPF would be used for in such a system. You'd write a bit of code that can be loaded into a running program and executed as a particular condition occurs. Like how you can attach some code to evaluate on a breakpoint in many other debuggers.

Calling Visual Studio 'less powerful' than GDB is telling.

When you have a bug that doesn't happen until hours into your program execution, being able to set a breakpoint, edit-and-continue, and set-next-statement are worth their weight in gold. You can solve in the moment what would take literally days, even weeks using a logging approach.

Sure they both have their place but having a debugger is indispensable, and Visual Studio's debugger is the undisputed king.

> undisputed king

Well, I personally do agree it's a better, more solid overall product. And also agreed that edit-and-continue in some highly stateful situation can be ::chef's kiss::

But I think there is room to dispute its kingliness (:

For instance, scripting GDB with Python is quite nice, on occasion.

Actually, just on Windows, WinDBG has some killer-feature functionality of its own.

Or back with GDB, take, for example, this classic post by Jonathan Blow, where there is some good back-n-forth discussion about GDB vs VStudio, with varying opinions:

* https://news.ycombinator.com/item?id=5125078

Plenty of disputers out there, I daresay.

> edit-and-continue

Breaks on enough codebases for me to not be in the habit of relying on it.

My own "killer feature" of VS debugging is being able to open up a crash dump, have VS auto-download matching pdbs from a symbol server, and auto-view the correct source file version/revision thanks to source indexing - without checking out by hand and turning my build stale. Killer ergonomics.

Sometimes I'll resort to windbg/cdb for memory pattern searches, automation, untangling wow-mangled crashdumps, etc. but VS itself is a nice first resort.

I've been wanting that symbol server style workflow for GDB for years. It looks like all the parts are there but I haven't found anyone who has plumbed them together into a complete system yet.

20+ years is... a lot. I do agree with the sentiment though. At first I was printf debugging because I didn't know better. Then I discovered debuggers and my mind was blown. But when I reached the point where I hit bugs that would magically disappear when running the program through a debugger, I finally understood that there's value in becoming good at both debugging styles.

Debugging issues with multithreaded code can be difficult because you could be looking at a race condition that only happens when the code is running at full speed, and pausing one or all threads in the debugger could give you a different experience than the real world.

(Debug) logging can also change the frequency/order/synchronization of threads, masking race conditions.

Yeah, I have fought with this occasionally. It's one of the handful of "gotchas" you need to think of when designing logging, and interpreting its results.

Sometimes you can improve the situation by sticking things in a bigger memory buffer and tightening control of when things get flushed to disk, but there's always that fundamental "observing the system will change it" problem. Similar issues arise with a debugger too, of course.
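To make the "bigger memory buffer, tighter control of flushing" idea concrete, here is a small sketch using Python's standard `logging.handlers.MemoryHandler`. This is an illustration only, not the setup the commenter used; the logger name, the capacity of 1000 records, and the choice to flush on ERROR are all assumptions.

```python
import io
import logging
import logging.handlers

# io.StringIO stands in for a log file on disk.
stream = io.StringIO()
target = logging.StreamHandler(stream)
target.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

# Hold up to 1000 records in memory. They are written to the target only
# when the buffer fills up or a record at ERROR or above arrives.
buffered = logging.handlers.MemoryHandler(
    capacity=1000, flushLevel=logging.ERROR, target=target
)

log = logging.getLogger("buffered_demo")  # hypothetical logger name
log.setLevel(logging.DEBUG)
log.propagate = False
log.addHandler(buffered)

log.debug("step 1")
log.debug("step 2")
before_error = stream.getvalue()  # still empty: records sit in the buffer

log.error("something broke")      # triggers a flush of everything buffered
after_error = stream.getvalue()
print(after_error, end="")
```

Buffering keeps the hot path cheap, but it is the same trade-off discussed elsewhere in the thread: anything still in the buffer when the process dies is lost unless something flushes it first.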

Yeah, I had to debug some code with 5 threads that were supposed to be synchronized through the use of several semaphores, and that was a bear to pin down.

Debuggers are chronically under-invested in. A lot of the pain points could be fixed just with more money and staff. It's a chicken-and-egg situation --- people don't use debuggers that much, for various reasons, so there isn't the investment, so debuggers don't improve, etc etc.
> Compare the above with logging: It is simple and trustworthy.

Is it though? Add concurrency and it's no longer simple. Have an operation running millions of times a second? Good luck even logging that fast enough. If you start adding logic to print only sometimes, now you get a poor-man's debugger.

Yes, see the other sequence of comments[1] where I clarify that a bit. I didn't mean it as an unalloyed "always maximally simple and trustworthy in all cases" claim. Sorry for the ambiguous wording.

Yet for me, in those high-concurrency/high-frequency situations, I feel like a debugger is also a pain to use and trust? In truth, I've had best luck solving those sorts of issues by old-fashioned thinking through it / talking it over with someone / creating hypotheses and testing them sort of analysis. And even for that, logging (or profiling, event tracing, and similar "dump lots of data" approaches) tends to produce better "see the big picture" information than a debugger's laser-like focus. Might come down to one's personality somewhat, too (:

[1] https://news.ycombinator.com/item?id=35101564

Great comment. Since it sounds like your jam, do you have any suggestions for reading on logging best practices in this context?

Hmm, I don't have much in the way of links; but I can brain-dump some of my accrued personal opinions, for what they're worth:

* Relatively early in your project, incorporate a "real" logging system, and start leaning on it. For me, in C++, I have most recently been using 'spdlog'[1] fairly happily. It accumulates data in memory and dumps it from a dedicated logging thread at a configurable frequency. That approach helps avoid logging getting in the way of your main program performance, but it has downsides (see below).

** You want logging to be easy to add wherever you need it. It should be about as easy as adding a literal "printf" statement; minimize the barrier to making use of it.

* Be able to slice/dice your logs by category of information, not just debug/warning/error. I find myself often making special "SYSTEM_XYZ_LOG(...)" macros, for different systems. Then, with a compiler flag I can enable/disable output for different facets of the program. Some of those I might leave on, and others are so specialized (or performance-impacting) that I only enable them when needed.

* It can be nice to have several log files, one per system (eg, in a video game: graphics messages go to one file, physics to another, UI to another, etc). However, it can also be useful to have a "combined" log that shows everything intermingled, so you can see the overall timing of things without having to correlate timestamps across files. Ideally your logging system supports having both.

* Along with the above, develop an ergonomic way of viewing various logs at once. My approach is pretty simplistic: I have a bunch of XTerm windows, and some of them are displaying logs, sometimes filtered variously (eg: `tail -F foo.log | grep -C5 interesting-pattern`). You could get fancier with tmux or something.

** Aside: the general skill of being able to wrangle lots of data from various text files is a good one to develop — not just for logging. I find this much easier to do in a Unix-style environment than on Windows.

* There is a tradeoff between logging synchronously (eg: writing to STDERR, unbuffered) or accumulate-then-flush (such as a buffered STDOUT, or a logger thread approach like in spdlog). If your program crashes, you might not see the most recent (and thus most relevant!) log messages in the latter approach. I usually have some "write to stderr right friggin' now" function for special cases when I need it. However, if you run your program in a debugger and the crash happens, you might be able to step the debugger to let the logging thread dump out whatever is not-yet-flushed-to-disk. I have had good success with that; I just have to remember to do it.

* When you generate too many logging messages, there is a tradeoff between "flush to disk to free up memory" and "just allocate more memory". If you are generating huge amounts of logs, it is possible you will use too much memory on the system, in the latter approach. But in some cases, the latter approach is faster. I've had good enough luck using the former approach and just using a "plenty big" memory buffer.

* If you are logging across multiple machines, you usually need to correlate events via timestamps. So, make sure your clocks are synchronized, or at the very least be aware of this issue (can be a pain).

* Don't be afraid to allow big sizes for your log files, at least when debugging. Storage is cheap, grep is fast. Depends on the scope of your project, of course.

* There's a difference between the logging for your codebase in general, and logging for very specific debugging purposes. I find it handy to have a special log for the latter case, which generally is empty, but when I am investigating something, I can write to that log and monitor it specially, to cut down on having to sift through a lot of noise. It is just the "debugging the current problem" log, and I delete those log statements once I am done. This is essentially the same as using ad-hoc "printf" statements, but using the real logging system, with the benefits that affords.

* Ideally, your log system should not construct its message unless it is actually wanted. Eg: if your log level is "warnings or worse", then LOG_INFO("foo={}", someValue) should not perform any string building work. This seems fairly common today, but some logging APIs don't get this right.

* In C/C++, logging should go through a macro, so it can be compiled out (or compiled out beyond a given severity level) depending on your build. spdlog supports this, and it is fairly easy to write your own, typically.

* A nice-to-have feature is to be able to only log the first N of a duplicate message, when desired. Sometimes (especially when things go wrong), your program will produce an outrageous amount of the same log message, which just drowns out the useful information and potentially uses lots of memory. Some logging systems have explicit support for this concept. You can also roll your own (eg: by adding a timer or counter guarding some logging).

* Another nice-to-have feature is having a notion of pushing/popping scopes for logging (eg: log4j's "NDC" concept). In your code, this would correspond to lexical scopes. In the log statements, it would come across as some sort of "toplevel>outer>inner>" prefix or so. This is one thing I wish spdlog had.

** Aha! In digging up the docs for NDC, I found this[2], which does mention a book for your reading list: "Patterns for Logging Diagnostic Messages" part of the book "Pattern Languages of Program Design 3" edited by Martin et al. I cannot vouch for it.
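The point in the list above about not constructing a message unless it is wanted can be shown with Python's standard `logging` module, which defers `%`-style formatting until a record is actually emitted. This is a sketch in Python rather than the C++/spdlog setup the comment describes; the `Expensive` class and the logger name are made up for the demonstration.

```python
import io
import logging


class Expensive:
    """Hypothetical value whose string form is costly; counts renderings."""

    calls = 0

    def __str__(self):
        Expensive.calls += 1
        return "expensive-value"


stream = io.StringIO()
log = logging.getLogger("lazy_demo")  # hypothetical logger name
log.addHandler(logging.StreamHandler(stream))
log.propagate = False
log.setLevel(logging.WARNING)  # "warnings or worse"

value = Expensive()

# Below the threshold: the logger returns early and never formats the message.
log.info("foo=%s", value)
calls_after_info = Expensive.calls

# At the threshold: the message is formatted once, when it is emitted.
log.warning("foo=%s", value)
calls_after_warning = Expensive.calls

# An f-string builds the text eagerly, even though the record is discarded.
log.info(f"foo={value}")
calls_after_fstring = Expensive.calls

print(calls_after_info, calls_after_warning, calls_after_fstring)
```

Note that the arguments themselves are still evaluated before the call; if computing an argument is the expensive part, a guard such as `if log.isEnabledFor(logging.INFO):` around the call avoids that work too.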

And, as mentioned variously in this thread, logging is just one tool in the toolbox. Don't forget about performance counters, event traces, writing good tests, the debugger, etc. Good logging requires some up-front cost, but is generally worth it, IME.
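For the "only log the first N of a duplicate message" idea from the list above, one way to roll your own in Python is a `logging.Filter` with a counter. This is a sketch under assumptions: `FirstNFilter` is a hypothetical name, and "duplicate" is taken to mean "same logger, level, and unformatted message template".

```python
import io
import logging
from collections import Counter


class FirstNFilter(logging.Filter):
    """Let each distinct message template through at most `limit` times."""

    def __init__(self, limit):
        super().__init__()
        self.limit = limit
        self.seen = Counter()

    def filter(self, record):
        # record.msg is the template before arguments are substituted, so
        # "attempt %d" counts as one message regardless of the number.
        key = (record.name, record.levelno, str(record.msg))
        self.seen[key] += 1
        return self.seen[key] <= self.limit


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(message)s"))

log = logging.getLogger("dedupe_demo")  # hypothetical logger name
log.setLevel(logging.INFO)
log.propagate = False
log.addHandler(handler)
log.addFilter(FirstNFilter(limit=2))

for attempt in range(5):
    log.info("retrying connection, attempt %d", attempt)
log.info("connected")

lines = stream.getvalue().splitlines()
print(lines)
```

A time-based variant (allow N per minute, say) would follow the same shape, with timestamps stored instead of a plain count.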

[1] https://github.com/gabime/spdlog

[2] https://logging.apache.org/log4j/1.2/apidocs/org/apache/log4...
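The push/pop scope idea mentioned in the list above (log4j's "NDC") can be approximated in Python with a context manager and a logging filter. This is only a sketch of the concept, not log4j's or spdlog's API; `log_scope`, `ScopeFilter`, and the `scope` record attribute are names invented for the example.

```python
import contextlib
import contextvars
import io
import logging

# Stack of active scope names, tracked per thread / async task.
_scopes = contextvars.ContextVar("log_scopes", default=())


@contextlib.contextmanager
def log_scope(name):
    """Push a scope name for the duration of a `with` block, then pop it."""
    token = _scopes.set(_scopes.get() + (name,))
    try:
        yield
    finally:
        _scopes.reset(token)


class ScopeFilter(logging.Filter):
    """Attach the current scope chain to every record as `record.scope`."""

    def filter(self, record):
        record.scope = "".join(f"{name}>" for name in _scopes.get())
        return True


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(scope)s %(message)s"))
handler.addFilter(ScopeFilter())

log = logging.getLogger("scope_demo")  # hypothetical logger name
log.setLevel(logging.INFO)
log.propagate = False
log.addHandler(handler)

with log_scope("toplevel"):
    log.info("start")
    with log_scope("inner"):
        log.info("working")
    log.info("done")

lines = stream.getvalue().splitlines()
print(lines)
```

Because the scope is popped in a `finally` clause, the prefix stays correct even when an exception unwinds through the block.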

Thanks for taking the time - I really appreciate it and lots to think about. This whole thread has definitely swayed my views on the relative merits.

Always enjoy reading people enthusing about tooling they know well.