by voidlogic 4697 days ago
When I see someone doing something like this and:

  1. They are not using the latest Go (atm 1.1.2)
  2. GOMAXPROCS is not = the number of CPUs
  3. They are using ab rather than something more scalable like wrk
I assume they either don't know what they are doing, or want to make Go look bad.

On a side note, Go is already known to be much faster at web-serving than node.js: http://www.techempower.com/benchmarks/#section=data-r6&hw=i7...

I did the test that the article suggested, with the versions of the tools that I have installed on my system. This is a comment on a blog article, not an attempt to engineer the perfect benchmark.

Also in this bench Go danced circles around node. I dunno what you're complaining about.

>I dunno what you're complaining about.

I'm complaining because you and other people will go on to:

  1. Use a version of Go in your development that is slower (and has much worse memory characteristics) and lacks new features.
  2. End up running all your programs on a single core until you understand GOMAXPROCS
  3. Use ab to bench real things, which is bad
So "my complaining" is trying to help you.
I agree with Voidlogic here. Perhaps his tone was a little confrontational, but his intentions were good. :)

    Go 1.1 > Go 1.0.2
    wrk > ab
In particular, ab should be avoided whenever possible. Apache Bench (ab) remains a single-threaded tool, meaning that for high-performance servers in particular, your exercise will run into the limits of Apache Bench before the limits of the server(s) being tested. The LigHTTP team has a multi-threaded clone named WeigHTTP that I would recommend if you want something that is functionally similar to ab and uses similar command-line arguments.

Wrk uses a slightly different argument syntax from ab and WeigHTTP but has some upsides:

1. Wrk is also multi-threaded.

2. In our experience, wrk is slightly higher-performance than WeigHTTP (~5 to 10%).

3. Wrk provides average, maximum, and stdev for latency.

4. Wrk provides a time-limited mode (rather than request-count limited), which is appealing for some test types.

In my experience, as long as you configure Go and node to use all of your cores, Go will benchmark better than Node in any permutation of these configuration variables:

    Go 1.0.2 and node tested with ab.
    Go 1.1 and node tested with ab.
    Go 1.0.2 and node tested with wrk.
    Go 1.1 and node tested with wrk.
V8 is a very fast JavaScript runtime; node.js is modestly quick at handling HTTP requests. But among the many features of Go is a high-performance HTTP package. If you've used both, it isn't all that surprising that Go's performance clocks in higher than node.
Is this the wrk you're referring to:

https://github.com/wg/wrk

Yes. Sorry for not providing the link!
> Perhaps his tone was a little confrontational

Sorry, that was not intended.

>but his intentions were good. :)

They really were...

>> wrk > ab

Can you please expand on why? I recently came across wrk and am in the process of evaluating a switch from ab. Thank you.

daemon13, you might also be interested in reading this thread: https://news.ycombinator.com/item?id=6114282
thank you!
Sure. I just edited my message above.
Thank you!

You've got a cool blog, especially the selection of earlier posts!

Can you help me by telling me why the way I used GOMAXPROCS is wrong, and how to use it correctly?
Not so much wrong as my impression was you didn't understand it. If that is not the case I apologize.

For example you said: GOMAXPROCS left default. I don't know how you set your environment vars, are they unset so default = 1? You didn't mention in your post that GOMAXPROCS=1/single node worker test cases are really toy test cases (useful only for benchmarking). So if you know everything below, then great! Maybe other people can learn:

GOMAXPROCS is the number of OS-level threads that the Go runtime multiplexes Go tasks (goroutines) over.

So if GOMAXPROCS = 1, when one goroutine blocks, another will run, BUT only one OS thread will ever be executing Go code at a time, and thus you will never use more than one logical core.

Setting GOMAXPROCS correctly is per application. For example GOMAXPROCS=1 might be right for a command-line tool or a program that was designed to have multiple instances started on the same machine. That being said, the vast majority of the time any high-load application I have written is best with GOMAXPROCS=<# of logical CPU cores>. So Go always has concurrency, but GOMAXPROCS gives it parallelism. GOMAXPROCS > 1 also allows the garbage collector to have more parallelism.

So if we are talking about a benchmark like this, ideally we want to process requests made in parallel in a parallel fashion. A good rule of thumb: if you use a Node.js worker cluster, you should probably test Go with GOMAXPROCS set to the same number of workers.

All this being said, depending on your CPU's implementation details, you would sometimes be better off setting both your node worker count and GOMAXPROCS to the number of physical rather than logical cores. Sometimes simultaneous multi-threading (SMT, aka hyperthreading) actually creates more overhead than any concurrency gains it offers.

In short, when testing something like this I would always test:

  1. n = 1 (with a disclaimer note)
  2. n = physical CPU count
  3. n = logical CPU count
Where n is the value of GOMAXPROCS / the number of node workers.
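
The three settings above could be driven from a small harness along these lines. This is only a sketch: `testSettings` is a made-up helper, and the physical core count is guessed as half the logical count (assuming 2-way SMT), because the Go standard library reports only logical CPUs. Substitute the real number for your machine.

```go
package main

import (
	"fmt"
	"runtime"
)

// testSettings returns the values of n to benchmark, in order:
// 1, the physical core count, and the logical core count.
// Duplicates and values below 1 are dropped.
func testSettings(physical, logical int) []int {
	var out []int
	for _, n := range []int{1, physical, logical} {
		if n < 1 {
			continue
		}
		dup := false
		for _, seen := range out {
			if seen == n {
				dup = true
				break
			}
		}
		if !dup {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	logical := runtime.NumCPU()
	// ASSUMPTION: 2-way SMT, so physical = logical / 2. Adjust for your CPU.
	physical := logical / 2

	for _, n := range testSettings(physical, logical) {
		runtime.GOMAXPROCS(n)
		// Run the benchmark here; the node side would use n workers.
		fmt.Printf("benchmark run with GOMAXPROCS=%d\n", n)
	}
}
```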

Okay so this is why I was confused by your comment:

I did use GOMAXPROCS with the number of logical cores that I have, and I did test the node cluster with the same number.