Yeah, it all depends on the code authors. Here is one more example of an MQTT broker in Erlang, VerneMQ, and it's very readable: https://github.com/erlio/vmq_server We are currently evaluating it.
I really can't imagine that this benchmark is relevant anymore. Haven't Go, D, and Erlang had major changes since then? E.g., Go is now compiled by a compiler written in Go.
I'm planning on updating it this year once I finally find the time to learn Rust. Developing an MQTT broker has become my go-to task for learning new languages.
How Go is compiled isn't as relevant to its performance as what Go is compiled into, which hasn't changed all that much as far as I know. The GC improvements might help with latency though.
The translation of the compiler from C to Go was a mostly mechanical transformation, specifically to prevent any changes in the resulting compiled programs. Go compilation is getting better all the time, but the switch itself did not change the resulting programs.
An article from 2013 means that it covers a change from at least Go version 1.2 (released 2013/12/01) to 1.5 (released yesterday?). The performance section of each release document talks about cases where programs may run slower or faster in certain instances, in every release since, including the one that changed the compiler from C to Go[2].
That said, I took the original statement about Go switching from C to Go to indicate that there have been major changes in the compiler, so it would be interesting to see more recent results, not that the change itself was necessarily responsible for major speed improvements. But they could very well have assumed the compiler change would have had a larger effect on the resulting binary, depending on their understanding of the Go toolchain.
It's still hard to take any benchmark seriously when given the quote:
> Mosquitto was compiled with gcc 4.8.2, the Go implementation was executed with go run, the D implementation was compiled with dmd 2.0.64.2 and the Erlang version I’m not sure.
The "I'm not sure" speaks for itself, but go run also includes both compilation time and execution time and they're comparing it against just the execution times of the other languages. That's not exactly an apples to apples comparison.
I didn't write the Erlang version, don't know Erlang nor have any idea how it's built. The curious can always check the code out.
I don't know why you think that start-up time has an effect on the benchmarks. It doesn't matter how long the brokers take to start; the measurements were only taken once they were up. `go run` doesn't change a thing.
"Since the Erlang unit tests are in the same files as the implementation, it’s hard to know exactly how many lines long it is. It gets worse since it implements most of MQTT, the D implementation essentially only implements what’s necessary to run the benchmarks."
Benchmarking the entire spec versus only the minimal set is almost certainly part of the problem here. If you want to benchmark implementations against each other, you should probably make sure they implement the same thing!
Are you sure about that one? I am thinking about the situation where presence of alternative code paths that never actually get executed can lead to fairly large differences in timings, particularly for tight loops. (At least in computational code; I'd expect it to be much less common for protocol handling benchmarks like these...)
~20kSLOC IIRC. It's not a fair comparison since it does more, but let's face it, if these implementations did the exact same thing I doubt they'd pass the 5kSLOC mark.
Did you use GOMAXPROCS=$YOUR_CORES, or just the default in 1.2? Go is actually really good at multithreading. You also have a Java example: did you warm it up or not? That's why I hate benchmarks: most of them are screwed, since most people who write benchmarks only know a little of each language (they just want to show that "their" language is good).
Back then I tried GOMAXPROCS from 1 to 8 and it didn't make much of a difference. Later on with a different version of Go (I don't remember which), increasing GOMAXPROCS to 2 made a difference, but any more than that was pretty much the same. In any case (and again, the last time I tried), the Go implementation with GOMAXPROCS=2 was the slowest even though it was using twice as many threads as the other ones!
As for showing "their"/"my" language: there were implementations from 4 different sources, and I didn't write the benchmarks; the guy who wrote the Go implementation did (and that's why the benchmark app is in Go).
It seems this article was posted in 2013. Not sure where that falls in the Go release timeline, but I'm guessing the version wasn't as far behind then as it seems now.
Edit: After a quick search, it looks like Go 1.2 was released only 4 days prior to this article being posted.
The Erlang code looks very nice. If you read this, great work Patrick!
https://bitbucket.org/pvalsecc/
Nice use of gen_fsm + binary matching.
Here is an example of the client code that takes only 200 lines:
https://bitbucket.org/pvalsecc/erlangmqtt/src/f37505188c1f1c...
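For readers who don't know Erlang: binary matching lets you pull bit fields straight out of a packet in a pattern. As a rough point of comparison, here is what decoding the MQTT fixed header looks like when done by hand in Go. This is only a sketch and is not taken from any of the implementations discussed; `fixedHeader` and `parseFixedHeader` are hypothetical names. The layout itself (packet type in the upper 4 bits of the first byte, flags in the lower 4, then a 1 to 4 byte variable-length "remaining length") comes from the MQTT specification.

```go
package main

import (
	"errors"
	"fmt"
)

// fixedHeader holds the decoded fields of an MQTT fixed header.
type fixedHeader struct {
	packetType   byte // upper 4 bits of the first byte
	flags        byte // lower 4 bits of the first byte
	remainingLen int  // length of the rest of the packet
	headerLen    int  // number of bytes the fixed header occupied
}

// parseFixedHeader decodes the first byte and the variable-length
// "remaining length" field (7 data bits per byte, high bit = continuation).
func parseFixedHeader(buf []byte) (fixedHeader, error) {
	if len(buf) < 2 {
		return fixedHeader{}, errors.New("incomplete header")
	}
	h := fixedHeader{packetType: buf[0] >> 4, flags: buf[0] & 0x0F}

	multiplier := 1
	for i := 1; i < len(buf) && i <= 4; i++ {
		h.remainingLen += int(buf[i]&0x7F) * multiplier
		if buf[i]&0x80 == 0 { // no continuation bit: length is complete
			h.headerLen = i + 1
			return h, nil
		}
		multiplier *= 128
	}
	if len(buf) >= 5 {
		// Four length bytes were read and all had the continuation bit set.
		return fixedHeader{}, errors.New("malformed remaining length")
	}
	return fixedHeader{}, errors.New("incomplete header")
}

func main() {
	// 0x30 = PUBLISH (type 3) with no flags; 0xC1 0x02 encodes length 321.
	h, err := parseFixedHeader([]byte{0x30, 0xC1, 0x02})
	fmt.Println(h.packetType, h.flags, h.remainingLen, h.headerLen, err)
}
```

In Erlang the first two fields would typically come out of a single bit-syntax pattern, which is part of why protocol code tends to read so compactly there.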