I think what the Go lot have been working towards is that individual pauses are very short. Contrast that with a Java server GC, which aims for overall efficiency. (Tell me if I'm wrong, I'm no expert.)
As another commenter posted elsewhere, GC is all about tradeoffs, and for any given set of GC tradeoffs there will always be pathological cases. Another GC might work out of the box for this workload, but perform very badly for a workload at which Go's GC excels. As others have already mentioned, the fact that he was so easily able to tune the GC is a testament to Go's simplicity.
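For anyone who hasn't touched it, here is a rough sketch of what that kind of tuning can look like. It assumes the knob in question is the GC target percentage (the `GOGC` environment variable, or `debug.SetGCPercent` at runtime); the thread doesn't say exactly what was changed. The helper `gcCycles` and the allocation sizes are made up for illustration. It counts how many collections run while churning through the same amount of garbage at different settings: a lower percentage collects more often (more CPU, smaller heap), a higher one collects less often (less CPU, bigger heap).

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// sink is package-level so the allocations below escape to the heap.
var sink []byte

// gcCycles (hypothetical helper) sets the GC target percentage to pct,
// allocates roughly totalMB megabytes of short-lived garbage, and returns
// how many GC cycles completed meanwhile. The previous setting is restored
// before returning.
func gcCycles(pct, totalMB int) uint32 {
	prev := debug.SetGCPercent(pct)
	defer debug.SetGCPercent(prev)

	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	for i := 0; i < totalMB*16; i++ {
		sink = make([]byte, 64<<10) // 64 KiB per allocation, 16 per MiB
	}
	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC
}

func main() {
	// Same workload, three settings; exact counts vary by machine and Go version.
	for _, pct := range []int{50, 100, 400} {
		fmt.Printf("GOGC=%d: %d GC cycles\n", pct, gcCycles(pct, 256))
	}
}
```

The same setting can be applied without code changes by running the binary with, say, `GOGC=400` in the environment. Whether a higher or lower value helps depends entirely on the workload, which is rather the point being made above.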
There are lots of good criticisms of Go to be had, but its runtime is pretty remarkable, in my opinion.
As another comment in this thread explained, GCs have CPU and memory tradeoffs, and you can easily make a GC that would work excellently in the OP's benchmark but would suffer severely under other workloads.