Hacker News new | ask | show | jobs
by jerf 2933 days ago
Per-file may have had other problems not related to the Go runtime, such as IO contention. I'm not going to check it, but it would be easy to verify that just by using a limited number of them at a time. Spawning a new goroutine in that case is not strictly necessary, but would still be good software engineering.

One of the problems I see repeatedly when people try to benchmark things with concurrency is when they don't specify a problem that is CPU-intensive enough, so it ends up blocked on other elements of the machine. For a task like this, I'd expect optimized Go to easily keep up with a conventional hard drive, and with just a bit of work, come within perhaps a factor of 2 or 3 of keeping up with the memory bandwidth on a consumer machine (including the fact that since you're going to read a bit, then write some stuff, you're not going to get the full sequential read performance out of your RAM), not because Go is teh awesomez but because the problem isn't that hard. To get big concurrency wins, you need a problem where the CPU is chewing away at something but isn't constantly hitting RAM or disk or network for it, such that those systems become the bottleneck.

1 comments

Hi jerf, please note that

- the benchmark was designed to repeatedly parse an in-memory byte slice (not the hard drive), thus IO contention is unlikely here ;

- concurrency is a big win when IO is a bottleneck : keep processing dozens of things while some of them are waiting for data from network or hdd.

"the benchmark was designed to repeatedly parse an in-memory byte slice (not the hard drive), thus IO contention is unlikely here"

You could still be getting IO contention from the RAM system. RAM is not uniformly fast; certain access patterns are much faster than others.

"concurrency is a big win when IO is a bottleneck : keep processing dozens of things while some of them are waiting for data from network or hdd."

Concurrency is a win when IO is a bottleneck on a single task. Once you've got enough tasks running that all your IO is used up, adding more may not only fail to speed things up, but may slow things down. I'm speaking of situations where you've used up your IO. The tasks you're benchmarking are so easy per-byte that I think there's a good chance you used up your IO, which at this level of optimization, must include a concept of memory as IO.

I think you'd be helped by stepping down another layer from the Go VM and thinking more about how the hardware itself works regardless of what code you are running on it. Go can't make the hardware do anything it couldn't physically do, and I'm getting a sense you deeply understand those limits.