by bumper_crop 1466 days ago
Sorry to say, but these hit close to home for me. A lot of the synchronization paradigms in Go are easy to misuse, yet lead the author into thinking everything is okay. The WaitGroup one is particularly poignant for me, since the race detector doesn't catch it.

I'll add one other data race goof: atomic.Value. Look at the implementation. Unlike pretty much every other language I've seen, atomic.Value isn't really atomic, since the concrete type can't ever change after being set. This stems from the fact that interfaces are two words rather than one, and they can't be (hardware) atomically set. To fix it, Go just documents "hey, don't do that", and then panics if you do.

The lack of generics has forced all Go concurrency to be intrusive (i.e. implemented by the person using literally any concurrency), and yeah. It's horrifyingly error-prone in my experience. It means everyone needs to be an expert, and lol, everyone is not an expert.

Generics might save us from the simple, mechanical flaws. Expect to see `Locker<T>` and `Atomic<T>` types cropping up. And unbounded buffered thread-safe queues backing channels. Etc. I'm very, very much looking forward to it.
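A rough sketch of what such a `Locker<T>` could look like once generics land (Go 1.18+). The names `Locked`, `NewLocked`, `With`, and `countTo` are made up for illustration. The idea is that the value is only reachable through a method that takes the lock, though nothing stops a callback from leaking the pointer.

```go
package main

import (
	"fmt"
	"sync"
)

// Locked is a hypothetical wrapper: the guarded value is only reachable
// through With, so callers cannot forget to take the lock.
type Locked[T any] struct {
	mu  sync.Mutex
	val T
}

func NewLocked[T any](v T) *Locked[T] { return &Locked[T]{val: v} }

// With runs fn while holding the lock, passing a pointer to the guarded value.
func (l *Locked[T]) With(fn func(v *T)) {
	l.mu.Lock()
	defer l.mu.Unlock()
	fn(&l.val)
}

// countTo increments a guarded counter from n goroutines and returns the total.
func countTo(n int) int {
	counter := NewLocked(0)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			counter.With(func(v *int) { *v++ })
		}()
	}
	wg.Wait()
	var total int
	counter.With(func(v *int) { total = *v })
	return total
}

func main() {
	fmt.Println(countTo(100)) // 100
}
```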

--- edited to rant more ---

I also really wonder where all these "go makes concurrency a first-class concept" claims come from, because I see it in quite a few places, and I feel like it implies some very strong guarantees that absolutely do not exist.

Go has channels and select. That's neat. But on the other hand it has threads... but no thread handles. It has implicit capturing of closures. It has ambiguous value vs pointer semantics. It (style- and ergonomic-wise) encourages field references, which have no way to enforce mutexes or atomics. It has had crippled lock APIs that effectively force use of channels for... I don't know, philosophical reasons?

Go is abnormally dangerous when it comes to concurrency IMO. The race detector does an amazing job helping you discover it, but it's very easy to not use it or not take full advantage of it (i.e. non-parallel tests), and few run their production services with the race detector enabled. Because if they did, it would crash all the time, because there are an absurd amount of races in nearly all of the popular libraries (and in common use of those libraries, because concurrency is not a first-class citizen and you can't tell when it's happening / when it shouldn't happen).

> I also really wonder where all these "go makes concurrency a first-class concept" claims come from,

Given that some of the main architects behind Go had K&R C as background I wouldn't be surprised if "first-class" just meant that the language defines both a memory model and primitives for threading. C had neither until it basically adopted both from C++11.

I mean, this explanation makes sense, but after thinking about it a bit, I don't think those are even relevant.

I'd wager if one removed all notions of concurrency from Rust and only left in the `Send` and `Sync` traits (along with borrows, of course), it seems like Rust would still warrant such statements way more.

OTOH, saying this about Go due to "memory model and threading primitives" sounds a little bit like describing C++ as a language with "first class functions" because there's `operator()`…

That’s certainly how I interpreted it. It never occurred to me to think it meant “thread handles” specifically.
I meant it more in that "first-class" tends to mean "this is a thing that is represented in the language / type system".

Go has first-class functions, because you can make a `var fn func() string` field/variable/argument/etc that holds a reference to a func that returns a string.

Go does not have first-class types, because you can't reference or store a type directly. You can use reflection to pass a reflected thing representing a type, but not the type itself. Generics muddy this somewhat, but I'll argue that falls under "generics", not "first-class types". In contrast, Java has both generics and first-class types, because you can pass `SomeClass` itself as an argument.

---

Node.js arguably has first-class concurrency. It has async/await: you do not have concurrency without those keywords. If they exist, you have potential concurrency. If they do not, you do not. (there may be exceptions here for true thread use, and JS runtimes vary, but you get the idea)

Rust has async/await now, and also has Send/Sync, which gives it a very strong claim to "first-class concurrency".

Go's concurrency constructs have no representation in the type system. They're totally invisible. Channels and select are mostly used with concurrency, but they do not define concurrency, and can be (and are) used synchronously as well.

`go` is a keyword, but I don't see how that's any different than `new Thread(fn)`... except that the Thread has a better claim to first-class-ness, because it returns a value that represents the concurrently-executing thread. If you have a thread reference, you know that concurrency exists. The reverse is not true though.

I would say the notion of something being first-class is that you can manipulate it in the same way as a regular value in the language; in particular, you can use an expression that evaluates to such a thing in all the same ways that you can use a built-in version of that thing. Certainly Java does not have first-class types: you can pass a value that is sort of a representation of a flattened form of a type, but you can't use that kind of value in a "new" expression or as a function return type.

AIUI Rust's async/await still has quite a lot of special case support. IMO concurrency is only really first-class in languages like Haskell where you can manipulate async actions the same way as a user-defined type and implement concurrency-related operations in plain old code.

I mean, Go clearly has goroutines as a “first class” concept for some value of “first class”, and (as they are a sort of thread) goroutines are concurrency. This is to say, I don’t think your claim about “must have async/await in order to have first class concurrency” is correct in any formal sense (maybe you’re defining “first class concurrency” as requiring async/await rather than asserting that this is what first-class concurrency means to programmers generally?). I agree though that goroutines have no representation in the type system, but that’s because they aren’t values, so one wouldn’t expect them to have a type or a type system representation. Yet they are very much part of the Go runtime and not a library or a syscall or similar.
By that description though, Go also has first class types. And that kind of makes the distinction meaningless because essentially every programming language has types.

There might be room to claim first-class support for green threads? But if so it's a very weak "first class" since all you can do is start them.

Go does not have threads but something like "tasks". The fact that no thread handle is exposed allows for transparently moving these tasks across threads if the scheduler decides so.

I think "go makes concurrency a first-class concept" usually refers to goroutines being built into the language.

"Go is abnormally dangerous when it comes to concurrency IMO". Personally, that has not been my experience with Go concurrency. However, I have hit some issues when trying to orchestrate tasks via channels and ended up resorting to atomics to do the job.

> Go does not have threads but something like "tasks". The fact that no thread handle is exposed allows for transparently moving these tasks across threads if the scheduler decides so.

This doesn't stop there being "task handles" then, though? I think the point GP was making is that something that in most languages would be simple methods on a handle like "wait for this task to finish" or "stop this task" instead need to be done manually in Go with channels (or potentially `Context` in the latter case, although that was a later addition to the standard library). It doesn't really matter whether you call it a thread or a task; either way, it would be nice to get some return value from spawning some background operation and being able to use it to directly interact with it. I agree with GP that it does seem like an odd omission, since I haven't really heard any actual practical explanation for it.

Context for cancellation and replacing thread-local variables (or indeed any way to observe your "current" thread) is one of the things I like tbh. Though Context has abysmal performance implications.

But yeah, I want a goroutine handle with a "Wait()" method. Ideally also returning the results. Like most languages. It'd eliminate a ton of manual mutex and channel use that doesn't need to exist.
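Something like this is what I have in mind, as a sketch only. `Handle` and `Spawn` are hypothetical names; nothing like them exists in the standard library, and this needs generics (Go 1.18+).

```go
package main

import "fmt"

// Handle is a hypothetical "goroutine handle" carrying the result of fn.
type Handle[T any] struct {
	done chan struct{}
	val  T
	err  error
}

// Spawn runs fn on a new goroutine and returns a handle to it.
func Spawn[T any](fn func() (T, error)) *Handle[T] {
	h := &Handle[T]{done: make(chan struct{})}
	go func() {
		defer close(h.done) // closing publishes val and err to every waiter
		h.val, h.err = fn()
	}()
	return h
}

// Wait blocks until the goroutine finishes and returns its result.
// It can be called any number of times, from any goroutine.
func (h *Handle[T]) Wait() (T, error) {
	<-h.done
	return h.val, h.err
}

func main() {
	h := Spawn(func() (int, error) { return 6 * 7, nil })
	v, err := h.Wait()
	fmt.Println(v, err) // 42 <nil>
}
```

This does nothing about cancellation; that would still go through `Context` or a separate channel.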

---

Re thread vs tasks: that's an implementation detail. You write threaded code and it runs in multiple threads with thread-like memory behavior. In all in-Go observable ways it's identical to threads, and it could be changed to use real hardware threads tomorrow and none of the semantics would change at all. Even cgo would stay the same.

Go has (green) threads. Being more specific is relevant for runtime implementation spelunking and performance details, but not otherwise.

Yeah, I generally think of the word "thread" as referring to OS threads and/or "green" threads depending on context (and in this case I thought it was clear what you were referring to!), but since the person who responded to you made the distinction, I figured I'd use their terminology when explaining what I thought you were saying.
I was just leveraging your already-top reply to reply to both of you, sorry about that :) I should've just done two comments. I think you and I are on the same page here.

I think the main reason it doesn't exist is that go had no generics. It'd need to be another custom-generic type (Future[T] basically), and it would make it harder to pass around, just like channels. But since channels are generally intrusively-added, they aren't part of the return signature, so they avoid that generic-return issue. E.g. every "worker pool" accepts a `func()` and callers need to coordinate return values via channels, instead of needing to return a `func[T]()` reference which they have been unable to do until recently (to some degree at least).

Though they probably could've just said "use a Future[interface{}]", like they did for every other generic collection type.

Plus it'd take some of the emphasis off channels, and they seem to really not want to do that. If they were focused on usability instead of channels and select, they'd let us park on multiple mutexes just like channels, just like the runtime does internally a lot to implement all this... but no. Imagine a world where you could `select { case mut.Lock(): ...}`...
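The closest thing available today, as far as I know, is to fake a mutex with a one-slot buffered channel, which can then sit in a `select`. This is a sketch; `ChanMutex`, `NewChanMutex`, and `lockEither` are hypothetical names, and it is not a drop-in replacement for sync.Mutex.

```go
package main

import "fmt"

// ChanMutex is a hypothetical lock built from a 1-slot buffered channel:
// sending acquires the lock, receiving releases it.
type ChanMutex chan struct{}

func NewChanMutex() ChanMutex { return make(ChanMutex, 1) }

func (m ChanMutex) Lock()   { m <- struct{}{} }
func (m ChanMutex) Unlock() { <-m }

// lockEither blocks until it can take a or b, runs the critical section,
// and reports which lock it got.
func lockEither(a, b ChanMutex) string {
	select {
	case a <- struct{}{}:
		defer a.Unlock()
		return "a"
	case b <- struct{}{}:
		defer b.Unlock()
		return "b"
	}
}

func main() {
	a, b := NewChanMutex(), NewChanMutex()
	a.Lock()                      // a is held elsewhere
	fmt.Println(lockEither(a, b)) // b
	a.Unlock()
}
```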

My (rather horrid) pattern to address this problem is to wrap the goroutine in a function that returns a channel receiver. When the goroutine ends it sends something to the channel and whatever called it can await the result or completion using the receiver.
I have, on occasion, used a similar pattern, but instead of sending something, I simply close the channel (usually with a "defer close(c)" at the beginning of the function/closure that encompasses the main code of the goroutine's work).

That way, if I end up having multiple waiters, they will all be able to proceed.
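Roughly like this, as a sketch. `startWorker` and `waitAll` are names I made up; the point is that a closed channel releases every receiver, whereas a single send would only wake one.

```go
package main

import (
	"fmt"
	"sync"
)

// startWorker runs work on a goroutine and returns a receive-only channel
// that is closed when the work finishes.
func startWorker(work func()) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		work()
	}()
	return done
}

// waitAll has n goroutines wait on the same worker and returns how many
// of them were released.
func waitAll(n int) int {
	done := startWorker(func() {})
	var (
		mu       sync.Mutex
		released int
		wg       sync.WaitGroup
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-done // every waiter proceeds once the channel is closed
			mu.Lock()
			released++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return released
}

func main() {
	fmt.Println(waitAll(3)) // 3
}
```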

I've always thought it would be nice if the go command returned an ID. Doing so would also be completely backwards compatible, of course. Then add a library or few builtins to do things on that ID, at minimum maybe kill it, perhaps get status of it, etc. Maybe not full blown actor model, but having nothing feels powerless.
Go has OS threads and “green threads” (goroutines). You create a goroutine via the go keyword and the Go runtime assigns it to an OS thread. You can have many goroutines on a single OS thread, and by default at most one OS thread per CPU core runs Go code at a time (though that is configurable).

The GP is correct that you cannot manage a goroutine from outside of that goroutine. With (for example) POSIX threads, which still leave a lot to be desired, you can at least manage the thread from other threads.

Go definitely has some rough edges around threading. The idea is you’re supposed to use channels for everything, but in my experience channels have so many edge cases that can subtly lock up your application completely that it’s often easier to fall back on the classic mutex-style idioms.

I do really like the go keyword, it’s handy. But I have a background in POSIX threads so probably find concurrency in Go easier than most yet even I have to concede that Go under-delivered on its concurrency promises.

"because there are an absurd amount of races in nearly all of the popular libraries"

This is FUD. I ran the race detector against a lot of popular libraries and never found issues like that.

But since you're claiming there are issues everywhere, do you have examples?

I'd say there's an excellent chance your assumptions about the types you got from "popular libraries" are more conservative, and that's why you never detected any issues.

For example take the JSON decoder. If you have several tasks which can use some data from a JSON blob in parallel, is it OK if they all just share the same JSON decoder?

If you're horrified because this seems obviously like a bad idea, that'll be why you didn't find any trouble. In some other systems your programs would be needlessly slow and clunky as a result, but in Go your assumptions were appropriate.

It seems Groxx expects in this case that either the JSON decoder would work fine used this way, or, the documentation would highlight that you can't do this. Go chooses neither.

Here's Brad Fitzpatrick:

"The assumption when unstated is that things are not safe for concurrent use, that zero values are not usable, that implementations implement interfaces faithfully, and that only one return values is non-zero and meaningful."

These are some pretty important assumptions, or to look at it another way, potential foot guns.

Replying here to the two sibling comments that were confused by the decoder example.

What tialaramex is saying, is that if you have a stream of JSON values, you create a JSON decoder over it. Then every time you call the decode() method, you get the next decoded JSON value.

Then you want to process the JSON values concurrently.

Rephrased, the question was what would happen if you were to have every concurrent task call the decode() method whenever it wants a new value to work on?

It would probably be a data race cluster fuck. But you might find this type of mistake everywhere in Go. I myself have fought things like that in many libraries.

One such occurrence I recall was in the Google Cloud Pub Sub client library. It basically did something similar to this example, trying to offer concurrency over a stream of messages. It would fail very rarely, and it pretty much always passed the race detector. It wasn't fun to debug.

A json.Decoder holds on to a single io.Reader, using it concurrently to decode multiple things is just plain old absurd. How would that even work?

https://pkg.go.dev/encoding/json#NewDecoder

> How would that even work?

The same way it works to do so sequentially?

Why would you share a decoder? It makes no sense since you need to decode just once.
You can decode multiple values from a stream.

In a language which focuses on concurrency correctness, the decoder would either be thread-safe (in which case you could use it as an input queue) or not be usable from multiple threads (in which case you’d clearly have to create the queue yourself).

On atomics in Go, the beta for Go 1.19 was released an hour ago (https://groups.google.com/g/golang-announce/c/SNruPJUSFz0?pl...).

> The sync/atomic package defines new atomic types Bool, Int32, Int64, Uint32, Uint64, Uintptr, and Pointer. These types hide the underlying values so that all accesses are forced to use the atomic APIs. Pointer also avoids the need to convert to unsafe.Pointer at call sites. Int64 and Uint64 are automatically aligned to 64-bit boundaries in structs and allocated data, even on 32-bit systems.

Go 1.19 is expected to release in August.
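A small sketch of what that looks like with the new types (requires Go 1.19). The `stats` struct and `record` function are made up for illustration; the point is that there is no plain int64 field left to read or write by accident.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// stats is a hypothetical struct; hits can only be touched through the
// atomic.Int64 methods.
type stats struct {
	hits atomic.Int64
}

// record increments the counter from n goroutines and returns the final value.
func record(n int) int64 {
	var s stats
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.hits.Add(1)
		}()
	}
	wg.Wait()
	return s.hits.Load()
}

func main() {
	fmt.Println(record(1000)) // 1000
}
```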

> atomic.Value isn't really atomic, since the concrete type can't ever change after being set.

How does this mean it's non-atomic? As far as I know you can still never Load() a partial Store(). (Also, even if it was possible, this would never be a good idea...)

That's why I opened with "Look at the implementation". Go is unable to store the type and the pointer at the same time, so it warps what "atomic" means. Pretty much every other language has atomic mean "one of these will win, one will lose". Go says "one will win, one will panic and destroy the goroutine".

In fact, it's even worse than that. If the Store() caller goes to sleep between setting the type and storing the pointer, it causes every goroutine that calls Load() to block. They can't make forward progress if the store caller hangs.

> If the Store() caller goes to sleep between setting the type and storing the pointer, it causes every Goroutine that calls Load() to block.

Where does this go to sleep: https://cs.opensource.google/go/go/+/refs/tags/go1.18.3:src/...

It looks like a CAS busy loop with preemption disabled, to me.

Make sure to read between the lines. It only looks like a busy loop. Remember, the OS can pause and preempt your thread at any time. This is a real and likely event.
By reading the lines and not between them, you could read these two lines: runtime_procPin() and runtime_procUnpin(). With explicit comments that these pause preemption.
> Pretty much every other language has atomic mean "one of these will win, one will lose"

Could you elaborate on how "pretty much every other language" implements it?

This is why all the examples call Store immediately with a zero value of the type.
https://go.dev/play/p/xolc9oPwA0C

Interfaces don't have a zero type, which means that we can't have an atomic.Value which stores Shape. atomic.Value would be much easier to reason about if it had store semantics similar to a regular `var foo Shape = ...`. One of the other comment threads talked about generics helping this, so maybe there is hope.
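One way generics could help, as a sketch: with the `atomic.Pointer[T]` type from the Go 1.19 beta you can store a pointer to the interface value, so the swap is a single word and the concrete type is free to change. `AtomicShape`, `Circle`, and `Square` are hypothetical names here.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

type Shape interface{ Name() string }

type Circle struct{}
type Square struct{}

func (Circle) Name() string { return "circle" }
func (Square) Name() string { return "square" }

// AtomicShape is a hypothetical wrapper. It stores a pointer to an interface
// value, so each Store is one pointer-sized atomic write.
type AtomicShape struct {
	p atomic.Pointer[Shape]
}

// Store publishes s; s escapes to the heap because its address is kept.
func (a *AtomicShape) Store(s Shape) { a.p.Store(&s) }

// Load returns nil if nothing has been stored yet.
func (a *AtomicShape) Load() Shape {
	if p := a.p.Load(); p != nil {
		return *p
	}
	return nil
}

func main() {
	var best AtomicShape
	best.Store(Circle{})
	best.Store(Square{})            // different concrete type, no panic
	fmt.Println(best.Load().Name()) // square
}
```

The cost is an extra allocation and indirection per Store compared with atomic.Value.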

Parent means

    var bestShape atomic.Value
    bestShape.Store((*Circle)(nil))
Which will store it as a *Circle, and only allow more *Circles, not Shapes. That part of GP’s claim is correct.

It just had nothing to do with atomicity; it means something specific, not just “I like the failure mode.”

Abend is a fairly normal and in many ways best way to "lose" in a race. It's fine, it's atomic.
This is Atomic*

* Just don’t be an idiot. Worse is better.