Go (the language) has at least two implementations: the official Go implementation and GCC (yes, Go is included in GCC, along with Fortran and Ada).
The latest Go implementation (Go 1.7) has made Go a lot faster. I would argue that it is closer to the speed of executables generated with GCC (gcc/g++) than OpenJDK, the Oracle JVM, Mono or the .NET compiler for C#.
Go (the language) can be made just as fast as C (the language), for many cases. Go has the advantage of making it much easier to use multiple processors, though.
> The latest Go implementation (Go 1.7) has made Go a lot faster. I would argue that it is closer to the speed of executables generated with GCC (gcc/g++)
Go 1.7 performs nowhere near the set of optimizations that GCC and LLVM do. GCC/LLVM have a huge number of algebraic simplifications (InstCombine), aggressive alias analysis, memory dependence analysis, instruction scheduling, an optimized instruction selector, a highly tuned register allocator with stuff like rematerialization, SCCP, etc. etc. It will take years and years for Golang to come close.
> Go (the language) can be made just as fast as C (the language), for many cases.
No, it can't. The M:N scheduling model will always have some overhead relative to 1:1 if you don't need the performance profile of C10K-style servers. The dynamic semantics of "defer" is an unavoidable performance tax over RAII. Unwinding is mandated by the language, inhibiting some optimizations. There is little control over allocation: language constructs allocate in ways that are not immediately obvious. The fact that interfaces result in huge numbers of virtual calls results in a good amount of overhead that (unlike Java) Go can't even eliminate with inline caching, because it's AOT compiled. This is just off the top of my head.
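To make the interface point concrete, here is a minimal sketch of the same loop written against an interface and against a concrete type. `Shape`, `Rect`, `TotalDynamic` and `TotalStatic` are hypothetical names made up for illustration. Whether a given compiler version manages to devirtualize or inline either call is not something this sketch claims; it only shows where the dynamic dispatch sits in the source.

```go
package main

import "fmt"

// Shape is a hypothetical interface used only for this illustration.
type Shape interface {
	Area() float64
}

// Rect is a hypothetical concrete type that implements Shape.
type Rect struct{ W, H float64 }

func (r Rect) Area() float64 { return r.W * r.H }

// TotalDynamic calls Area through the interface, so the callee is
// looked up via the interface's method table at run time.
func TotalDynamic(shapes []Shape) float64 {
	total := 0.0
	for _, s := range shapes {
		total += s.Area()
	}
	return total
}

// TotalStatic calls Area on the concrete type, so the callee is known
// at compile time and the compiler is free to inline it.
func TotalStatic(rects []Rect) float64 {
	total := 0.0
	for _, r := range rects {
		total += r.Area()
	}
	return total
}

func main() {
	rects := []Rect{{1, 2}, {3, 4}}
	shapes := make([]Shape, len(rects))
	for i, r := range rects {
		shapes[i] = r // storing a value in an interface may also allocate
	}
	fmt.Println(TotalDynamic(shapes), TotalStatic(rects))
}
```

Both functions compute the same result; any difference would only show up in a benchmark, which is left out here on purpose.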
> Go has the advantage of making it much easier to use multiple processors, though.
Not really. Go's parallelism primitives are just as low-level as those of C. The "one size fits all" scheduling algorithm is a poor fit for getting the most performance out of multicore. The lack of generics is a real problem: it prevents you from using optimized concurrent data structures without paying the tax of interface{} or going through code generation hoops.
In any case, the lack of SIMD basically kills Go's applicability in these domains.
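For readers who have not used them, this is roughly what Go's parallelism primitives look like for a plain data-parallel job: a sketch, assuming a hypothetical `ParallelSum` helper. Note that the partitioning, the join (`sync.WaitGroup`) and the reduction are all written by hand, much as they would be with threads in C.

```go
package main

import (
	"fmt"
	"sync"
)

// ParallelSum (hypothetical helper) splits xs into at most `workers`
// chunks, sums each chunk in its own goroutine, then adds the partial sums.
func ParallelSum(xs []int, workers int) int {
	if workers < 1 {
		workers = 1
	}
	partial := make([]int, workers)
	chunk := (len(xs) + workers - 1) / workers // ceiling division

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		lo := w * chunk
		if lo >= len(xs) {
			break
		}
		hi := lo + chunk
		if hi > len(xs) {
			hi = len(xs)
		}
		wg.Add(1)
		go func(w, lo, hi int) {
			defer wg.Done()
			sum := 0 // local accumulator, written back once at the end
			for _, x := range xs[lo:hi] {
				sum += x
			}
			partial[w] = sum // each goroutine writes its own slot only
		}(w, lo, hi)
	}
	wg.Wait()

	total := 0
	for _, p := range partial {
		total += p
	}
	return total
}

func main() {
	xs := make([]int, 100)
	for i := range xs {
		xs[i] = i + 1
	}
	fmt.Println(ParallelSum(xs, 4))
}
```

How well this scales depends on the scheduler and the workload; the sketch makes no claim either way.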
Everything you wrote is true, but I think you're overstating the performance cost of Go design and implementation choices. I think the parent comment is fair in saying that Go is somewhere between GCC and the JVM in terms of performance. But I agree that Go is not designed for extreme performance (the kind of software where even a 1% gain matters), and that C/C++/Rust are better for these purposes.
Until you have control over stack/heap and data locality you're never going to be able to approach C/C++/Rust speeds.
Conversely if you're using C/C++ through a ton of heap/virtual pointers then you're losing a lot of the value the language brings and should be using something higher level.
Go allocates on the heap unless it can prove something doesn't escape, in which case it's on the stack. Not explicit programmer control, but I think you can reasonably make it do what you want.
Because it exposes pointers as a first-class concept, you also have good control of how data is laid out in memory (=> locality).
It's not like Python or Java where everything is a pointer and gets spread out all over memory.
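A small sketch of the layout claim above, using a hypothetical `Point` type: a `[]Point` stores its elements inline, one after another, while a `[]*Point` stores pointers to separately allocated values. `IsContiguous` is an illustrative helper, not a standard function; it uses `unsafe` only to compare addresses.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Point is a hypothetical value type used for illustration.
type Point struct{ X, Y float64 }

// IsContiguous reports whether the first two elements of pts sit exactly
// one Point apart in memory. It needs at least two elements.
func IsContiguous(pts []Point) bool {
	first := uintptr(unsafe.Pointer(&pts[0]))
	second := uintptr(unsafe.Pointer(&pts[1]))
	return second-first == unsafe.Sizeof(pts[0])
}

func main() {
	// One backing array; the Point values are stored inline.
	byValue := make([]Point, 4)

	// One backing array of pointers; each Point is a separate value
	// that may live anywhere on the heap.
	byPointer := make([]*Point, 4)
	for i := range byPointer {
		byPointer[i] = &Point{}
	}

	fmt.Println("slice of values is contiguous:", IsContiguous(byValue))
	fmt.Println("bytes per element, values:", unsafe.Sizeof(byValue[0]))
	fmt.Println("bytes per element, pointers:", unsafe.Sizeof(byPointer[0]))
}
```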
> Not explicit programmer control, but I think you can reasonably make it do what you want.
Escape analysis, like any such analysis, gets much more difficult in the presence of higher-order control flow. Currently the Go compilers punt on higher order control flow analysis. And Go uses higher-order control flow in spades, due to its heavy reliance on interfaces.
The end result is that lots of stuff is heap allocated.
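You can check what the escape analysis decides for your own code by building with `go build -gcflags=-m`, which prints the compiler's decisions. The sketch below uses hypothetical names (`Vec`, `SumByValue`, `NewVec`, `Describe`) to show three typical shapes; the exact diagnostics, and whether inlining changes the outcome at a call site, vary by compiler version, so no specific output is claimed here.

```go
package main

import "fmt"

// Vec is a hypothetical type used for illustration.
type Vec struct{ X, Y int }

// SumByValue keeps v local to the call, so the compiler can keep it
// on the stack.
func SumByValue(x, y int) int {
	v := Vec{x, y}
	return v.X + v.Y
}

// NewVec returns the address of v, so v must outlive the call. This is
// the textbook case where a value is moved to the heap.
func NewVec(x, y int) *Vec {
	v := Vec{x, y}
	return &v
}

// Describe hands v to fmt.Sprint, whose parameters are interface values.
// Converting to an interface is one of the places where the analysis
// tends to give up, as described above.
func Describe(v Vec) string {
	return fmt.Sprint(v)
}

func main() {
	fmt.Println(SumByValue(1, 2))
	fmt.Println(NewVec(1, 2).X)
	fmt.Println(Describe(Vec{1, 2}))
}
```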
> Java where everything is a pointer and gets spread out all over memory.
That's not true for Java. Its generational garbage collector performs bump allocation in the nursery, yielding tightly packed objects with excellent cache behavior. Allocation in HotSpot is like 3-5 instructions (really!)
I think the HotSpot approach makes the most sense: instead of trying to carve out special cases that fall down regularly, focus on making heap allocations fast, as you'll need to make them fast anyway. After that, add things like escape analysis (which HotSpot has as well).
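For anyone unfamiliar with bump allocation, here is a minimal sketch of the idea in Go. This is not HotSpot's code, and `Nursery` with its methods is a hypothetical type invented for illustration. The point is that the fast path is just an add, a bounds check and a store, which is why it can compile down to a handful of instructions.

```go
package main

import "fmt"

// Nursery is a hypothetical fixed-size region that hands out memory by
// bumping an offset. It is a sketch of the technique, not a real GC.
type Nursery struct {
	buf []byte
	off int
}

// NewNursery creates a nursery backed by size bytes.
func NewNursery(size int) *Nursery {
	return &Nursery{buf: make([]byte, size)}
}

// Alloc rounds size up to a multiple of 8 and returns the next free
// bytes, or nil when the region is full.
func (n *Nursery) Alloc(size int) []byte {
	size = (size + 7) &^ 7 // round up to 8-byte granularity
	end := n.off + size
	if end > len(n.buf) {
		return nil // a real collector would run a minor collection here
	}
	p := n.buf[n.off:end:end]
	n.off = end // the "bump"
	return p
}

// Used returns the number of bytes handed out so far.
func (n *Nursery) Used() int { return n.off }

// Reset makes the whole region available again in one step, standing in
// for what happens after survivors have been evacuated.
func (n *Nursery) Reset() { n.off = 0 }

func main() {
	n := NewNursery(32)
	a := n.Alloc(3)
	b := n.Alloc(16)
	fmt.Println(len(a), len(b), n.Used())
	fmt.Println(n.Alloc(16) == nil) // does not fit in the remaining space
}
```

Consecutive allocations land next to each other, which is where the cache-friendliness mentioned above comes from.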
Java: But you're still chasing pointers for an array of objects, right? Vs being able to just say, "I want this array to be X objects, all laid out in a row in memory." I'm not a Java programmer, but I'm pretty sure I've seen code that used primitive types rather than classes to get around this.
Actually, even Go isn't helping as much as it could here -- sometimes you want to have an array of objects that lays out each column (field) of memory contiguously, which Go gives you no easy way to do. But then neither does C or C++.
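The column-wise layout mentioned above is usually called "struct of arrays". A sketch of both layouts in Go, with hypothetical `Particle` and `Particles` types; the conversion has to be written by hand because the language has no built-in support for it.

```go
package main

import "fmt"

// Particle is the usual "array of structs" element: all fields of one
// particle sit next to each other.
type Particle struct{ X, Y, Mass float64 }

// Particles is the hand-written "struct of arrays" form: each field is
// stored in its own contiguous slice.
type Particles struct {
	X, Y, Mass []float64
}

// ToColumns copies an array of structs into the column-wise layout.
func ToColumns(ps []Particle) Particles {
	cols := Particles{
		X:    make([]float64, len(ps)),
		Y:    make([]float64, len(ps)),
		Mass: make([]float64, len(ps)),
	}
	for i, p := range ps {
		cols.X[i], cols.Y[i], cols.Mass[i] = p.X, p.Y, p.Mass
	}
	return cols
}

// TotalMassAoS walks whole structs even though it reads only one field.
func TotalMassAoS(ps []Particle) float64 {
	total := 0.0
	for i := range ps {
		total += ps[i].Mass
	}
	return total
}

// TotalMassSoA touches only the Mass column.
func TotalMassSoA(ps Particles) float64 {
	total := 0.0
	for _, m := range ps.Mass {
		total += m
	}
	return total
}

func main() {
	ps := []Particle{{0, 0, 1}, {1, 1, 2}, {2, 2, 3}}
	fmt.Println(TotalMassAoS(ps), TotalMassSoA(ToColumns(ps)))
}
```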
Isn't allocation just a couple of instructions for basically all GC'd languages?
> Isn't allocation just a couple of instructions for basically all GC'd languages?
If those GC'd languages have a precise generational GC with bump allocation in the nursery. Go doesn't (and the proposed GC design doesn't allow for this, unfortunately).
> That's not true for Java. Its generational garbage collector performs bump allocation in the nursery, yielding tightly packed objects with excellent cache behavior. Allocation in HotSpot is like 3-5 instructions (really!)
That post's numbers are entirely based on a giant multi-gigabyte long-lived array: the classic worst case for a generational GC. That is not representative of most memory allocations. The generational hypothesis, which has been empirically verified in real world code again and again, is that most allocations are short-lived and small.
Can you do an arena allocator in Go with disparate types? If not then you're really missing out on data locality.
I'm also not a huge fan of a compiler "automatically" performing escape analysis. It makes it very easy for a single change to cause cascading perf problems that are hard to catch.
no, not with disparate types (maybe if you resort to weird tricks with unsafe). i'd be curious to hear a use-case for this that ends up being different than just 'normal gc allocation' (not doubting you, just curious).
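For the single-type case, an arena can be written in safe Go. The sketch below assumes a hypothetical `Node` type and `NodeArena` allocator; it hands out pointers into fixed-capacity chunks so that nodes created together end up next to each other in memory. Mixing disparate types in one arena would need `unsafe`, which this sketch does not attempt.

```go
package main

import "fmt"

// Node is a hypothetical linked-list node used for illustration.
type Node struct {
	Value int
	Next  *Node
}

// NodeArena (hypothetical) allocates Nodes out of fixed-capacity chunks.
// A chunk is never grown past its capacity, so pointers into it stay valid.
type NodeArena struct {
	chunks    [][]Node
	chunkSize int
}

// NewNodeArena creates an arena that allocates chunkSize nodes at a time.
func NewNodeArena(chunkSize int) *NodeArena {
	if chunkSize < 1 {
		chunkSize = 1
	}
	return &NodeArena{chunkSize: chunkSize}
}

// New returns a pointer to a fresh Node stored inside the arena.
func (a *NodeArena) New(value int) *Node {
	last := len(a.chunks) - 1
	if last < 0 || len(a.chunks[last]) == cap(a.chunks[last]) {
		// Current chunk is full (or there is none yet): start a new one.
		a.chunks = append(a.chunks, make([]Node, 0, a.chunkSize))
		last++
	}
	// Appending within capacity reuses the chunk's backing array.
	a.chunks[last] = append(a.chunks[last], Node{Value: value})
	chunk := a.chunks[last]
	return &chunk[len(chunk)-1]
}

// Chunks returns how many chunks have been allocated so far.
func (a *NodeArena) Chunks() int { return len(a.chunks) }

func main() {
	arena := NewNodeArena(2)
	var head *Node
	for i := 1; i <= 5; i++ {
		n := arena.New(i)
		n.Next = head
		head = n
	}
	sum := 0
	for n := head; n != nil; n = n.Next {
		sum += n.Value
	}
	fmt.Println(sum, arena.Chunks())
}
```

The garbage collector still owns the chunks, so the arena as a whole is freed once nothing points into it any more; the gain is locality and fewer individual allocations, not manual lifetime control.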
Two different things here: previous versions of the compiler got slower as it was moved, in automated fashion, from C to Go. Code compiled with Go was not slower. The new version of Go has both improved compile time (though not quite to the speed of the first few versions of the compiler) and improved code generation, so code is faster than any previous version of Go.