Hacker News new | ask | show | jobs
by ultimoo 4434 days ago
I think growing a string with << is an expensive operation. Since once the allocated memory is full, it will cost O(n) to copy the entire string to a new location. If we are carrying out m insertions, this could take O(mn).

The way I have been building strings is to store the m strings in an Array and then join() it at the end (which should be only O(n)). The m small objects need to be allocated in both cases, the only difference being that we hold on to those and join them later on.

Am I missing something? I have never measured the performance on either of the these operations so I could be off the mark here.

1 comments

For the cases I have benchmarked concatenation with << has been faster than join on an array.

Here's a simple benchmark:

    require "benchmark"

    n = 100_000
    Benchmark.bm(4) do |x|
      x.report("<<") do
        n.times do
          "aaaaa " << "bbbbbb " << "ccccc " << "ddddd " << "eeeee " << "fffff"
        end
      end

      x.report("join") do
        n.times do
          ["aaaaa", "bbbbbb", "ccccc", "ddddd", "eeeee", "fffff"].join(" ")
        end
      end
    end
The results I get for this are:

               user     system      total        real
    <<     0.140000   0.000000   0.140000 (  0.143750)
    join   0.230000   0.000000   0.230000 (  0.228035)
I'd be interested to see if there were any use cases where the relative performance was reversed.

    require "benchmark"
    n = 100_000

    A = "foo " * 20
    B = "bar " * 20
    C = "baz " * 20
    D = "foobar " * 20
    STRINGS = [A, B, C, D]

    Benchmark.bm(4) do |x|
      x.report("<<") do
        n.times do
          "" << A << B << C << D
        end
      end

      x.report("join") do
        n.times do
          STRINGS.join
        end
      end
    end


               user     system      total        real
    <<     0.090000   0.010000   0.100000 (  0.102411)
    join   0.070000   0.010000   0.080000 (  0.071142)