Hacker News new | ask | show | jobs
by frankmcsherry 2212 days ago
Thanks for the interesting post! I liked the MergeDouble approach which I hadn’t seen before. Couldn’t that be extended to finding the median of the results using binary search and then doing four concurrent merges, for even more ILP?
1 comments

Yes that is very much possible. In general you won't get 4 equal length ranges though. For instance in an already sorted array the median results in (n/2,0) and a (0,n/2) division. But the average case should be two sets of merge intervals both of size (n/4, n/4). In reality I think you will quickly run into other limits. You need quite a bit of registers, that shouldn't spill to stack.
I think the median of the results works great in any case (but, you are the expert here). If we find the result median in each of the two inputs, say 'a' and 'b' we have four merges (two forward, two reverse):

    [0, ..) w (0, ..)
    (.., a) w (.., b)
    [a, ..) w [b, ..)
    (.., n) w (.., n)
each of which should produce n/4 elements and then meet.

If the array is already sorted then great; that split will just "merge":

    [0, ..) w nothing
    (.., n/2) w nothing
    nothing w [n/2, ..)
    nothing w (.., n)
which should go very fast. Anyhow, possible I'm missing something!
No you are not missing something. This is, I think, correct. I haven't tried it but it seems possible and possibly very effective.