Hacker News new | ask | show | jobs
by kjksf 3021 days ago
String benchmarks are so broken.

They way he uses b.N is wrong. b.N is different for different loops so he's e.g. timing 100 iterations of string '+' with a 1000 iterations of builder.WriteString()

Also the compiler can completely null out no-op functions (without side effects) so in benchmarks it's a good idea to assign the value being calculated into e.g. a global variable.

The corrected code is: https://gist.github.com/kjk/6a7d7135ae1e5fa6cd1f0db23d2eaf4d

An example of correctly benchmarking:

    func BenchmarkConcatString(b *testing.B) {
	for n := 0; n < b.N; n++ {
		var str string
		for i := 0; i < 100; i++ {
			str += "x"
		}
		gStr = str
	}
    }

After fixes it paints significantly different picture:

    go test -bench=. -benchmem
    goos: darwin
    goarch: amd64
    BenchmarkConcatString-8    	  300000	      5148 ns/op	    5728 B/op	      99 allocs/op
    BenchmarkConcatBuffer-8    	 1000000	      1046 ns/op	     368 B/op	       3 allocs/op
    BenchmarkConcatBuilder-8   	 1000000	      1177 ns/op	     248 B/op	       5 allocs/op
2 comments

His use of b.N is correct. The code you have is simply multiplying N by 100 with the inner for loop - so your times are 100x of what each "concat" operation (+,WriteString) takes.

You are also allocating a new string/buffer/builder for every run - which is not useful if you want to just benchmark concat.

Thanks for pointing it out. Should clearly not depend on the number of iterations. It's fixed now.
I think there's another bug in the generateSlice function if the intention is to create a slice with n random numbers.

    func generateSlice(n int) []int {
        s := make([]int, n)
        for i := 0; i < n; i++ {
            s = append(s, rand.Intn(1e9))
        }
        return s
    }
As it is now, the function creates a slice with n zeros followed by n random numbers. I suppose you meant to say make([]int, 0, n). You could just as well assign directly to each slice element instead of using append, which would be more efficient.

I made the exact same mistake quite a few times myself.

Yep, that meant to be capacity, not length. Corrected. Thanks!