Hacker News new | ask | show | jobs
by jcranmer 1202 days ago
I've debugged core dumps before. It's not been a particularly pleasant experience--good luck trying to do something like `call V->dump()` (dump out an easy-to-understand representation of a complex value to stdout... oh that doesn't exist anymore, can't use that functionality!)

The most useful aspect of time-travel debugging for me, personally, has been when the test case that causes a crash is refusing to be reduced, and the function that crashes does so on like the 453rd time it's called. Jumping straight to the crash, then reverse-continuing to a break point cuts out so much time of debugging (especially because it saves you if you accidentally continue the breakpoint one too many times; otherwise, you'd have to start the entire, tedious process from the beginning).

1 comments

Something else that time-travel debugging helps a lot with, is that in an awful lot of cases, what you have in the crash dump is a broken state, with no way to identify how that state happened to be. Like "why the hell does this variable have this value?". Sure, you can do the whole digging work to find what can possibly change that variable, and try different scenarios to hit them in a debugger at the moment the problem might appear in a new session that is not even guaranteed to produce the same crash. But with record-and-replay type things, you just set a watchpoint on the value, continue backwards, and there you go, you find where the value comes from.

Now, imagine you did do that manually and spent a lot of time finding that location. In many cases, that only gets you one step closer to the root cause, and you have to repeat the operation multiple times. Yes, you _can_ do that without record-and-replay, but do you really want to? Do you want to spend hours doing something that could take you minutes?

(And that's not even mentioning even worse cases, where the value you're tracking goes between processes via IPC)