Hacker News new | ask | show | jobs
by derdi 688 days ago
Oops, this had a performance bug. Instead of:

    if d.length = Array.length d.values then begin
      d.values <- Array.(append d.values (make (length d.values) x))
    end;
the array reallocation should actually be:

    if d.length = Array.length d.values then begin
      let new_array = Array.make (Array.length d.values * 2) x in
      Array.blit d.values 0 new_array 0 (Array.length d.values);
      d.values <- new_array
    end;
otherwise we allocate about a third more memory than needed. It's telling that even with this performance bug the dynamic array was broadly better than lists.

New results for the previously slowest cases:

    -- 25_000_000 elements --
    list:        0.977002 sec
    list:        0.963903 sec
    list:        0.950473 sec
    dynarray:    1.476165 sec
    dynarray:    1.281724 sec
    dynarray:    1.343222 sec
    my dynarray: 0.872558 sec
    my dynarray: 0.755902 sec
    my dynarray: 0.753746 sec
    
    -- 50_000_000 elements --
    list:        1.914777 sec
    list:        1.886989 sec
    list:        1.542614 sec
    dynarray:    2.922376 sec
    dynarray:    2.783559 sec
    dynarray:    2.537473 sec
    my dynarray: 1.725873 sec
    my dynarray: 1.545252 sec
    my dynarray: 1.515591 sec
    
    -- 75_000_000 elements --
    list:        2.827154 sec
    list:        2.835789 sec
    list:        2.318733 sec
    dynarray:    4.354404 sec
    dynarray:    4.150271 sec
    dynarray:    3.781488 sec
    my dynarray: 1.887360 sec
    my dynarray: 1.929286 sec
    my dynarray: 1.814873 sec
This turns an uneasy head-to-head into a clear win for dynamic arrays. Honestly, how could it be otherwise?
2 comments

> Honestly, how could it be otherwise?

It can't be otherwise if you're just assuming a straightforward compilation model where your written roughly reflects the assembly code that's generated. This just isn't the case with better compilers though.

For instance, Haskell can often perform rewrites or fusion passes that entirely eliminates the loop and all intermediate data structures, effectively giving a near-infinite speedup compared to alternatives. You typically can't perform such optimizations when side-effects are in the middle of the computation, for numerous reasons.

You could have backed this statement up with a benchmark, and you chose not to.

(You are right on the general hand-waving level that such optimizations are sometimes possible. But there are no fuseable intermediate data structures in this particular problem.)

But they are still very much in the same order of magnitude... pretty impressive that these solutions are all in the same ballpark, I would've expected much bigger differences.
The dynamic array spends most of its time copying old elements as it grows exponentially. If you pre-size it to the right size, you eliminate this copying, and the difference becomes 5x. In practice you often have an idea of the size, at least as a rough estimation, so you would win by a larger margin. But that was not part of the spec.

Also, other ways of organizing dynamic not-quite-an-array-but-not-quite-a-linked-list data structures exist. It could be a dynamic array (or linked list) of dynamic arrays to eliminate repeated copying of the oldest elements.