by DannyBee 4078 days ago
What do you do for inlining + code movement, where functions become interleaved (as do lines)?

In particular, after inlining, how are you guaranteeing that statements from a given function don't get moved before the inlined enterfunction call (and similarly with lines)?

Or do you not expect it to ever be able to report right answers for optimized programs? (which is a valid way to live life, of course, but ...)


Since it's modifying the source before compiling it, I expect that the compiler will conclude that most optimizations can't be applied when they cross breakpoint boundaries.

So, when debugging is turned on, the code would return the right answers, but it wouldn't have the same performance.

"Since it's modifying the source before compiling it, I expect that the compiler will conclude that most optimizations can't be applied when they cross breakpoint boundaries."

While true, this depends on the compiler knowing this is a magical breakpoint barrier it can't move things across. The compiler has no idea this is a barrier unless something has told it so. Looking at the godebug library, I don't see this being the case: it looks like it translates into an atomic store and an atomic load to some variables, and then a function call, which the compiler is definitely not going to see as a "nothing can move across" barrier.

(Also, having the debugging library alter the semantics of the program is 100% guaranteed to lead to bugs that are not visible when using the library, etc)

I think you're right that it's going to introduce bugs in concurrent code. For example, it's legal to send a pointer through a channel as a way of transferring ownership and never access the object again. If the debugger rewrites the code so that "never accesses it again" is no longer true, it's created a data race.
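A minimal sketch of the ownership-transfer pattern described above. The `buffer` type and the `producer`, `consumer`, and `transfer` functions are hypothetical names chosen for illustration. As written the program is race-free, because the sender never touches the object after the send; the comment marks where a debugger-inserted read would break that.

```go
package main

import "fmt"

// buffer is a hypothetical payload whose ownership moves over a channel.
type buffer struct{ data []int }

// producer builds a buffer, sends it, and never touches it again.
func producer(out chan<- *buffer) {
	b := &buffer{data: []int{1, 2, 3}}
	out <- b
	// Ownership has moved to the receiver. If instrumentation kept a
	// pointer to b and read *b from this goroutine after the send
	// (say, to display locals), that read would race with the
	// consumer's writes below.
}

// consumer takes ownership and mutates the buffer freely.
func consumer(in <-chan *buffer, done chan<- int) {
	b := <-in
	sum := 0
	for i := range b.data {
		b.data[i] *= 2
		sum += b.data[i]
	}
	done <- sum
}

// transfer wires the two goroutines together and returns the result.
func transfer() int {
	ch := make(chan *buffer)
	done := make(chan int)
	go producer(ch)
	go consumer(ch, done)
	return <-done
}

func main() {
	fmt.Println(transfer()) // prints 12
}
```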

On the other hand, godebug generates straightforward single-threaded code that creates pointers to locals in a shadow data structure and accesses them later. There's no reason it shouldn't work if you're not using goroutines.

In particular, a previous call to godebug.Declare("x", &x) will add a pointer to what was previously a local variable to a data structure. This effectively moves all locals to a heap representation of the goroutine's stack, to be accessed later. It's going to kill performance, but it's legal to do.
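A rough sketch of what "a pointer to a local in a shadow data structure" can look like. Only the `Declare("x", &x)` call shape comes from the comment above; the `Scope` type, `NewScope`, and `IntValue` are assumptions made up for this example and are not godebug's actual implementation.

```go
package main

import "fmt"

// Scope is a hypothetical shadow structure mapping variable names to
// pointers to the real locals. It is not godebug's actual type.
type Scope struct {
	vars map[string]interface{}
}

// NewScope returns an empty scope.
func NewScope() *Scope {
	return &Scope{vars: make(map[string]interface{})}
}

// Declare records a pointer to a local under the given name.
func (s *Scope) Declare(name string, ptr interface{}) {
	s.vars[name] = ptr
}

// IntValue dereferences the recorded pointer if it is an *int.
func (s *Scope) IntValue(name string) (int, bool) {
	p, ok := s.vars[name].(*int)
	if !ok {
		return 0, false
	}
	return *p, true
}

// compute registers its local, then keeps assigning to it.
func compute(s *Scope) int {
	x := 1
	// Storing &x in a structure that outlives this call means x can
	// no longer live only in a register or a stack slot.
	s.Declare("x", &x)
	x = 2
	return x
}

func main() {
	s := NewScope()
	compute(s)
	v, _ := s.IntValue("x")
	fmt.Println(v) // prints 2: the shadow scope sees the latest store
}
```

Because the pointer outlives the function call, the variable has to stay addressable and its stores have to stay visible through that pointer, which is the performance cost the comment refers to.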

"It's going to kill performance, but it's legal to do."

Sure, but it's going to cause the optimizer to do different things than it would have to that variable. As I said, this essentially changes what the compiler is allowed to do, and will expose or hide bugs (usually hide if it hurts the optimizer) :)

One only has to look at the Bugzillas of GCC and LLVM to discover all sorts of fun things that barriers hide/expose.

It's clearly not useful for tracking down compiler bugs, but might still be useful for more ordinary bugs in single-threaded user code.

"Also, having the debugging library alter the semantics of the program is 100% guaranteed to lead to bugs that are not visible when using the library, etc"

Can you give an example of the kind of bug you expect to see?

Sure, I'll stick to bugs that are invisible with the library and visible without it:

If you've inserted compiler barriers it can't move code across, the compiler will no longer perform the same optimizations with and without your debug library.

Those optimizations often make bugs visible, because variables no longer have the value you expect them to at the time you expect it, etc.

Add in threads, and the problem gets worse:

http://preshing.com/20120515/memory-reordering-caught-in-the...

Look at the behavior this has with and without the barrier. It's buggy without it. With it, it's fine.

This is exactly equivalent to:

Without the debugging library, the code is clearly buggy and doesn't work in user-visible ways.

With the debugging library, if I insert a breakpoint where the current asm barrier is, it now behaves correctly 100% of the time, even though it's broken.

Say you have:

    x = 1
    x = 2
    bar()

You set a breakpoint on the call to bar and examine the value of x. You would expect it to be 2, but what if the compiler had decided to move the assignment x = 2 to after the call to bar? There's no reason why it shouldn't. You'd then see x = 1, which would confuse you.

Thanks for the concrete example. The transformed source code that would get compiled for this example looks like:

    scope.Declare("x", &x)
    godebug.Line(ctx, scope, 3)
    x = 1
    godebug.Line(ctx, scope, 4)
    x = 2
    godebug.Line(ctx, scope, 5)
    bar()

The value of x is visible to all of the godebug.Line calls, so the compiler should know that it can't move x = 2 to after the call to bar.

Right, but now you have the opposite problem. Let's say that before, the compiler could move x=2 after bar (let's assume bar does not touch x).

Now, in your world, it can't.

So before, you would have seen x=1 in the call to bar, and now when you use the debug library, you will see x=2.
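To make the point concrete, here is a small sketch of an instrumented version of the three-line example. The `trace` type and its `line` method are hypothetical stand-ins for a per-line hook; they are not godebug's `Line` function. Because each hook reads x through a pointer, every store to x has to be complete before the hook that follows it, so the hook on the `bar()` line always sees 2.

```go
package main

import "fmt"

// trace is a hypothetical per-line hook that records the value of x
// it observes at each instrumented source line.
type trace struct {
	seen map[int]int
}

// line reads x through a pointer, so the preceding store must be
// visible by the time this call runs.
func (t *trace) line(x *int, n int) {
	t.seen[n] = *x
}

// bar does not touch x, matching the assumption in the thread.
func bar() {}

// run is the instrumented form of: x = 1; x = 2; bar()
func run() map[int]int {
	t := &trace{seen: make(map[int]int)}
	var x int
	t.line(&x, 3)
	x = 1
	t.line(&x, 4)
	x = 2
	t.line(&x, 5)
	bar()
	return t.seen
}

func main() {
	seen := run()
	fmt.Println(seen[3], seen[4], seen[5]) // prints 0 1 2
}
```

Without the hooks, nothing observes x between the two assignments, so a compiler would be free to drop or reorder those stores. That difference between the instrumented and uninstrumented builds is what the comment above is pointing at.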

Good question. I think https://news.ycombinator.com/item?id=9411158 is right, but I haven't looked at it closely yet. Thanks for bringing it up!