Hacker News new | ask | show | jobs
by scottlamb 1699 days ago
> When we Wait() on the errgroup, it will cancel its context even when every spawned go routine has returned nil.

I wonder why. It's documented [1] but seems strange to me. It enforces that errgroup is single-use (rather than cyclical), but I might do that by outright panicking if Go is called after Wait returns. Goroutines in the group might launch others in the group, but presumably not after they've finished, so I'd expect after Wait returns that there are no more calls to Go. Unless folks are passing the group to existing goroutines and then those launch more tasks in the group? That seems like an undesirable pattern.

[1] though not on Wait, just on WithContext: https://pkg.go.dev/golang.org/x/sync/errgroup#WithContext

2 comments

The context derived from WithContext is scoped to the lifetime of errgroup (after they all return, or any return with an error). This way the function passed to Group.Go can spawn other go routines that will get properly cleaned up.

Why those other go routines would spawn processes that they don't clean up otherwise? They probably shouldn't, but canceling the context should always be safe, and is more likely to safely handle what otherwise would be a bug, than it is to cause a bug.

> canceling the context should always be safe, and is more likely to safely handle what otherwise would be a bug, than it is to cause a bug.

They gave an example of a bug it caused, and I haven't seen any examples of bugs it would have prevented, so I'm not sure I agree.

I think the contract in the context package is that WithCancel may leak resources if cancel is not called and the parent context is cancelable but never canceled. At least, that seems to be the behavior I see in the implementation here:

https://github.com/golang/go/blob/master/src/context/context...

So in this case, errgroup probably doesn't have a choice and needs to call cancel() at some point.

Ahh. Yeah, that's unfortunate but explains this choice.

btw, I realized I'm not being entirely fair in saying it caused a bug. There would have been one anyway. If conflictsBuilder.Wait() failed, it wouldn't cancel the other two operations. It'd wait them out even though the overall operation was doomed. Maybe not a very noticeable bug under normal conditions but still not right.

Contexts spawn a goroutine to block on the Done channel. You always need to call cancel otherwise you'll leak that goroutine. It's an annoying thing that comes from the fact that contexts are implemented in user space and there's no way to block on a channel without block a goroutine.