I'm torn. As far as practicality goes, I actually agree with you: if I knew I were trying to do something on the order of 1,000,000 tasks in Go, I would probably use a worker pool for this exact reason. I have used this pattern in Go. It is certainly not unidiomatic. However, it also isn't the obvious way to do 1,000,000 things concurrently in Go. The obvious way is to write a for loop and launch a goroutine for each thing. The goroutine is the native unit of task, and it is very tightly tied to how I/O works in Go.
If you are trying to do something like a web server, then the calculus changes a lot. In Go, due to the way I/O works, you really can't do much but have a goroutine or two per connection. On the other hand, the overhead that goroutines imply starts to look a lot smaller once you put real workloads on each of the millions of tasks.
This benchmark really does tell you something about the performance and overhead of the Go programming language, but it won't necessarily translate to production workloads the way that it seems like it will. In real workloads, where the tasks themselves are usually a lot heavier than the constant cost per task, I actually suspect other issues with Go are likely to crop up first (latency, especially in performance-critical contexts). So realistically, it would probably be a bad idea to extrapolate from a benchmark this synthetic to try to determine anything about real-world workloads.
Ultimately though, for whatever purpose a synthetic benchmark like this does serve, I think they did the correct thing. I guess I just wonder exactly what the point of it is. Like, the optimized Rust example uses around 0.12 KiB per task. That's extremely cool, but where in the real world are you going to find tasks where the actual state doesn't completely eclipse that metric? Meanwhile, Go is using around 2.64 KiB per task. 22x larger than Rust as it may be, it's still not very much.
I think for most real-world cases, you would struggle to find too many tasks where the working set per task is actually that small. Of course, if you do, then I'd reckon optimized async Rust will be a true barn-burner at the task, and in a lot of those cases where every byte and millisecond counts, Go does often lose. There are many examples.[1]
In many cases Go is far from optimal: channels, goroutines, the regex engine, various codec implementations in the standard library, etc. are all far from the most optimal implementation you could imagine. However, I feel like they usually do a good job making the performance very sufficient for a wide range of real-world tasks. They have made some tradeoffs that a lot of us find very practical and sensible, and it makes Go feel like a language you can usually depend on. I think this is especially true in a world where it was already fine to run huge websites on Python + Django and other stacks that are much less efficient in memory and CPU usage than Go.
I'll tell you what this benchmark tells me really though: C# is seriously impressive.
[1]: https://discord.com/blog/why-discord-is-switching-from-go-to...
> I'll tell you what this benchmark tells me really though: C# is seriously impressive.
The C# team has done some really great work in recent years. I personally hate working with it and its "magic", but it's certainly in a very good place as far as trusting the CLR to "just work".
Hilariously I also found the Python benchmark to be rather impressive. I was expecting much worse. Not knowing Python well enough, however, makes it hard to really "trust" the benchmark. A talented Python team might be capable of reducing memory usage as much as following every step of the Go concurrency tour would for Go.