Hacker News
by squires 5086 days ago
Anyone have any insight as to how Go will avoid the garbage collection latency issues that arise in some situations with other GC languages, such as those that run on the JVM or CLR? I ask this in particular since Go is frequently mentioned as a C++ alternative.
5 comments

An "alternative" doesn't mean "has to be as good or better at everything that C++ does".

Most programs don't have garbage collection latency issues. Look at Android or Windows Phone - all the user-level apps are written in garbage-collected languages and they work just fine.

Look at the web - all the interaction is written in JavaScript and Gmail works just fine.

At this point in the discussion people usually bring up "real time" applications. Approximately no one writes those applications.

Go is an alternative to C++ in the sense that many (but not all) programs that you had to write in C++ you can now write in Go.

> Look at Android or Windows Phone - all the user-level apps are written in garbage-collected language and they work just fine.

Aren't there people blaming Android's perceived lagginess on this?

Many do, but actually the problem is caused by the UI using software rendering instead of the GPU. This only changed in Android 3.0.

Another thing is that many developers do too much in the UI thread instead of doing it in the background or asynchronously.

Exactly - most programs don't generate all that much garbage. We've been working on GC for 40 years now; unless you're being reckless you probably won't notice it.
I've been dealing with this in quite a bit of detail in the last week. The answer is: it is entirely possible to write a Go program (or sub-programs, which is even better: use GC when you need it) that produces absolutely no garbage, and it doesn't require being a seer to do so. It requires a bit more seer-ness to leverage the fairly simpleminded escape analysis to make some things more terse, but that's a fairly small workflow optimization.

The biggest hurdle I've had is that the standard library can make it difficult to avoid allocation: for example, I want to re-initialize a "bytes.Reader" or "bytes.Buffer", but as far as I know I can't just pass a new slice and reset the data structure's fields. As a result, I've had to do the somewhat unpleasant task of pulling in standard library files and tweaking them, or writing new abstractions to do some low-level stuff entirely. That is perhaps unavoidable: from a standard library perspective it is probably better to have a nice interface for the general case.

It is worse when you realize that a standard library function calls "make" in a place where it'll be in your inner loop: you really wish you could pass the memory to scribble into instead.
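
A minimal sketch of the "pass the memory to scribble into" style, for illustration only. appendRecord is a hypothetical helper, not something from the standard library; strconv.AppendInt is a real standard library function that already follows this pattern.

```go
package main

import (
	"fmt"
	"strconv"
)

// appendRecord is a hypothetical helper: it formats id and name into dst
// and returns the extended slice, so the caller decides where the bytes live.
func appendRecord(dst []byte, id int64, name string) []byte {
	dst = strconv.AppendInt(dst, id, 10)
	dst = append(dst, ':')
	return append(dst, name...)
}

func main() {
	// Allocate the scratch buffer once, outside the loop.
	buf := make([]byte, 0, 64)
	for id := int64(1); id <= 3; id++ {
		// buf[:0] keeps the backing array and resets the length, so
		// formatting needs no new allocation while the record fits.
		buf = appendRecord(buf[:0], id, "row")
		fmt.Printf("%s\n", buf) // printing itself may still allocate
	}
}
```

The design choice is that the function never calls make on its own; if the record outgrows the capacity, append grows the slice and the caller keeps the larger buffer for the next iteration.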

Another surprisingly expensive thing is the dynamic dispatch on interfaces. This is easy to avoid, though, if one just writes a specialized version of a function accepting the concrete types for the common case. Beyond that, there is memory copying, which I've found bytes.Buffer and bytes.Reader make hard to avoid while still capturing their useful semantics.
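
To make the interface point concrete, here is a rough sketch of a generic function next to a specialized one. sumGeneric, sumConcrete and demo are made-up names for this example; how much the specialized version actually saves depends on the compiler version, so measure before relying on it.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// sumGeneric reads through an interface, so every ReadByte call is
// dispatched dynamically via the interface's method table.
func sumGeneric(r io.ByteReader) int {
	sum := 0
	for {
		b, err := r.ReadByte()
		if err != nil {
			return sum
		}
		sum += int(b)
	}
}

// sumConcrete is the specialized version for the common case: the
// compiler knows the concrete type and can resolve the call statically.
func sumConcrete(r *bytes.Reader) int {
	sum := 0
	for {
		b, err := r.ReadByte()
		if err != nil {
			return sum
		}
		sum += int(b)
	}
}

// demo runs both versions over the same input.
func demo() (generic, concrete int) {
	data := []byte{1, 2, 3}
	return sumGeneric(bytes.NewReader(data)), sumConcrete(bytes.NewReader(data))
}

func main() {
	g, c := demo()
	fmt.Println(g, c)
}
```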

All in all: more pleasant than C for most projects, and generally very nearly as fast when written like this, so even if you end up having to write a good chunk of stuff yourself you are no worse off than if you had chosen C to begin with.

A very notable caveat: you can't handle in any way (afaik) failed requests for virtual memory, which is rather a killer for some kinds of programs. This one makes me a bit annoyed, but I can see why it's difficult to address. That doesn't make me feel much better, though.

To give a sense of what I'm doing: I'm parsing very small messages (minimum: 5 bytes, and often quite small; a common pathological case would be only a handful of bytes more) and passing them on with no interesting processing, and trying to get this to be fast in the trivial case. This is the Postgres wire protocol. So far, Postgres is still beating the crap out of me, which is annoying considering it actually has to do some work, and I'm just inspecting one byte and one 4-byte word of the message and then passing it on. At this point, with two copies, ~20% of the program's run time is in runtime.memmove, so the next tall pole is to eliminate one of the copies.
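
For readers unfamiliar with the framing, inspecting "one byte and one 4-byte word" could look roughly like the sketch below. It assumes the regular Postgres message layout (a type byte followed by a big-endian int32 length that counts itself but not the type byte); parseHeader and errShort are hypothetical names, and this is not the poster's code.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// headerLen is the type byte plus the 4-byte length word.
const headerLen = 5

var errShort = errors.New("message shorter than header")

// parseHeader returns the type byte and the length word without copying
// or allocating: it only reads from the slice it is given.
func parseHeader(buf []byte) (typ byte, length uint32, err error) {
	if len(buf) < headerLen {
		return 0, 0, errShort
	}
	return buf[0], binary.BigEndian.Uint32(buf[1:5]), nil
}

func main() {
	// A simple query message: 'Q', length 13 (4 for the length word
	// plus 9 for the NUL-terminated string), then the query text.
	msg := []byte{'Q', 0, 0, 0, 13, 'S', 'E', 'L', 'E', 'C', 'T', ' ', '1', 0}
	typ, length, err := parseHeader(msg)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("type=%c length=%d\n", typ, length)
}
```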

Despite the comparisons to C and C++, Go is not really a replacement for either of those languages. Each supports features that are not found in Go, and that are necessary for certain applications.

Go is more of a replacement for Python, Ruby, etc. The level of abstraction is very similar. The design of interfaces provides duck-typing. Obviously, Go is not exactly equivalent, but I find that Go is, in many ways, more similar to dynamic languages than it is to either C or C++.

This (among other topics) comes up frequently in Go discussions.

No one should say that Go is a replacement for C/C++ for all projects. However, it could be argued that most new, typical projects could be written in Go as opposed to C/C++/Python/etc.

There must be very few languages in the history of programming languages that could be considered complete replacements for others and that does not take anything away from those languages.

What would be such cases?

If you are thinking about writing operating systems, there are quite a few research operating systems written in GC-enabled languages as proofs of concept.

If you are talking about the amount of control over the machine, or the set of abstractions available to the programmer, then I fully agree with you.

D or Rust would be a better choice, eventually.

Go's model is such that, compared to JVM-based languages, you have the control to create (much) less garbage in the first place, greatly reducing the size and impact of garbage collection pauses.
> Go's model is such that, compared to JVM-based languages, you have the control to create (much) less garbage in the first place, greatly reducing the size and impact of garbage collection pauses.

I have only heard of escape analysis (which Java has) and structs inside structs (which C# has; C# also lets you put structs on the stack). Is there anything that Go has over Java and C#?

Go lets you pass around raw values without overhead. An int32 is 4 bytes. A struct { a, b int32 } is eight bytes. A [4]int32 is 16 bytes. A [4]struct{ a, b int32 } is 32 bytes. (I think you get the point :-)
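
The sizes quoted above can be checked with unsafe.Sizeof. This is just a small sketch; pair and sizes are names invented for the example.

```go
package main

import (
	"fmt"
	"unsafe"
)

// pair mirrors the struct { a, b int32 } from the comment above.
type pair struct{ a, b int32 }

// sizes reports the in-memory size of each value kind mentioned above.
func sizes() [4]uintptr {
	var i int32
	var p pair
	var arr [4]int32
	var pairs [4]pair
	return [4]uintptr{
		unsafe.Sizeof(i),
		unsafe.Sizeof(p),
		unsafe.Sizeof(arr),
		unsafe.Sizeof(pairs),
	}
}

func main() {
	fmt.Println(sizes()) // prints [4 8 16 32]
}
```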

In Java all objects are passed by reference, which causes bloat and cache locality issues. Also there are bookkeeping costs associated with even trivial objects: a simple array carries around an additional 16 bytes of memory.

Furthermore, the core Java libraries make it very difficult to write code that doesn't generate a lot of garbage. In fact, it's very difficult to write allocation-efficient code in Java without writing very unidiomatic Java code. These problems are worsened in more dynamic JVM-based languages like Scala and Clojure, which generate huge amounts of garbage due to runtime reflection.

Here's an interesting discussion of these issues:

http://loadcode.blogspot.com/2009/12/go-vs-java.html

> dynamic JVM-based languages like Scala and Clojure

Scala is a statically typed language

Go may make more efficient use of memory for the cases you describe, but the JVM still beats the pants off Go:

http://shootout.alioth.debian.org/u64q/benchmark.php?test=al...

Also, you haven't factored in Java's escape analysis and on-stack allocation.

The topic of this discussion is garbage collection and memory allocation. The graphs on that page show Go beating Java in memory usage in every case. I'm not sure what your point is.

Scala is a statically typed language but it must do runtime type reflection to implement some of its features on top of the JVM. That comes at a cost (and in fact we decided not to implement Go on the JVM for this exact reason). I know of a certain well-known company that uses Scala heavily and their JVM instances spend >80% of their CPU time in garbage collection.

The JVM has had a tonne of optimization done to it. The Go compilers and runtime have had barely any. There's plenty of low-hanging fruit: recent changes to the gc code generation have yielded as much as 2x speedups in certain operations.

My observation, from watching very skilled Java programmers build and deploy programs, is that garbage generation and collection latency cause serious problems. My observation of similar Go programs is that these kinds of problems don't really come up.

> Go lets you pass around raw values without overhead.

Does that rely on heuristics like escape analysis or is it user defined (and predictable)?

In C#, it's up to the person defining the type to decide whether it has value/stack semantics (a struct) or not (a class). Structs have some limitations regarding constructors, and being passed by value is surprising to users used to classes. For these and other reasons, the vast majority of types are classes.

So you can have on-stack or embedded objects, but they are fairly rare.

In Go, the user of a type decides whether it will be used by reference (on the GC heap) or directly on the stack or embedded in an object: you can take a pointer to any type, or use the type directly.

That gives the consumer of a type much greater flexibility to not create garbage if they don't want to.
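
A sketch of what "the user of a type decides" can mean in practice: the same vec type is embedded directly in one struct and referenced by pointer in another. All type and function names here are hypothetical.

```go
package main

import "fmt"

type vec struct{ x, y int32 }

// segment holds its two vec values inline: the whole struct is one
// contiguous block of memory, with no separate objects to track.
type segment struct{ from, to vec }

// linkedSegment holds pointers instead: each vec is a separate object
// that the garbage collector may have to trace.
type linkedSegment struct{ from, to *vec }

// length2 returns the squared length of an inline segment.
func length2(s segment) int32 {
	dx, dy := s.to.x-s.from.x, s.to.y-s.from.y
	return dx*dx + dy*dy
}

// length2Linked does the same through pointers.
func length2Linked(s linkedSegment) int32 {
	dx, dy := s.to.x-s.from.x, s.to.y-s.from.y
	return dx*dx + dy*dy
}

func main() {
	inline := segment{from: vec{0, 0}, to: vec{3, 4}}
	linked := linkedSegment{from: &vec{0, 0}, to: &vec{3, 4}}
	fmt.Println(length2(inline), length2Linked(linked))
}
```

The author of vec did not have to pick one form; both structs use the same type, and the consumer chooses per field.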

> In Go, the user of a type decides whether it will be used by reference (on the GC heap) or directly on the stack or embedded in an object: you can take a pointer to any type, or use the type directly.

Actually the distinction between heap and stack is not determined by how it is referenced. The compiler is free to stack-allocate any value as long as it does not escape the function. We can do this because pointers are opaque; there's no arithmetic.
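
A small sketch of the escape rule described above, with hypothetical function names. The comments describe what the compiler typically does; building with go build -gcflags=-m prints the actual escape analysis decisions, which can differ between compiler versions and with inlining.

```go
package main

import "fmt"

type point struct{ x, y int32 }

// sumLocal takes the address of a local value, but the pointer never
// leaves the function, so the compiler is free to keep p on the stack.
func sumLocal() int32 {
	p := point{1, 2}
	q := &p
	return q.x + q.y
}

// newPoint returns a pointer to its value. The value must outlive the
// call, so it escapes and is typically allocated on the heap.
func newPoint() *point {
	return &point{3, 4}
}

func main() {
	fmt.Println(sumLocal(), newPoint().x)
}
```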

The main point is that Go does not have classes. You just define methods on values. Values are no bigger than the data they represent, so you don't suffer from the same kinds of overheads seen in other "OOP"-centric languages.

How is that even possible? Surely there must be a type tag associated with each object if you can dispatch on it, just like in other languages.
A variable of an interface type is a pair consisting of a type descriptor and a concrete value. It is only when a concrete value of a statically known type is passed to something expecting an interface type that such a type tagged value is constructed.
For interface values there is a type associated with each value.

For normal values you don't need it, because Go is statically typed. The compiler knows where the method is.
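
To illustrate the difference, here is a rough sketch with a made-up celsius type. Calling the method on the concrete value is resolved at compile time; only the conversion to fmt.Stringer builds the (type, value) pair that gets dispatched at run time.

```go
package main

import "fmt"

// celsius is a plain float64 with a method; it carries no type tag.
type celsius float64

func (c celsius) String() string {
	return fmt.Sprintf("%.1fC", float64(c))
}

// viaInterface receives an interface value: a type descriptor plus the
// concrete value, created when the caller passes a celsius.
func viaInterface(s fmt.Stringer) string {
	return s.String() // dispatched dynamically through the interface
}

func main() {
	c := celsius(21.5)
	direct := c.String()      // static call: the compiler knows the type
	boxed := viaInterface(c)  // conversion to fmt.Stringer happens here
	fmt.Println(direct, boxed)
}
```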

Interesting, thanks!
But it ended up as an alternative to Python, language-wise of course. Library-wise, it is still far, far away.
IMO Go's standard library (http://golang.org/pkg/) is at least as complete as Python's for everyday programming, and includes some things that Python's standard library doesn't, such as crypto, image processing and a production-ready HTTP server.

Right now I'd say Python has an advantage in comprehensive frameworks such as Django or Numpy but I think it's just a matter of time before similar frameworks emerge for Go.