Hacker News new | ask | show | jobs
Why Sorted Array Are Processed Faster Than Unsorted Array? (mayankjain.me)
1 points by mayankj08 4671 days ago
2 comments

EDIT: What I wrote here is entirely wrong. I misread the use of 'sorted', and that colored my interpretation of the paragraph. My apologies.

"Well, this is due to something called as Branch Prediction. Branch Prediction is a strategy used in processors to decide if the branch would be taken or not."

Emphatically false. Python's sort is based on timsort. Timsort makes the observation that most real-world lists are partially sorted; either forward or reverse. This can help sorting go faster. The observed timing difference is nearly all to do with that, and not with branch prediction.

Consider the Shell sort. It's O(n*n) for an unsorted list and O(n) for a sorted list. Again, nothing to do with branch prediction and everything to do with how well the data fits the algorithm's expectations.

How sorting came into picture here?

I am just looping onto a list how is this related to complexity of sorting etc..

Sorry, but I didn't understood you properly.

My apologies. I misread the code and what it did, because of the use of the word 'sorted' and my belief that branch prediction should not affect this code.

I still don't believe it's due to branch prediction. I believe it's due to cache coherency.

Here's a more extensive example, where I shuffle the last N elements:

    import time
    from random import shuffle

    x = [i for i in range(1000000)]
    tick=time.clock()
    sum1=0
    for i in x:
        sum1=sum1+i
    toc=time.clock()
    print "ref", toc-tick, sum1

    for N in (0, 250000, 500000, 750000, 1000000):
        y1 = range(0, N)
        y2 = range(N, 1000000)
        shuffle(y2)
        y = y1 + y2
        tick=time.clock()
        sum1=0
        for i in y:
            sum1=sum1+i
        toc=time.clock()
        print "shuffle last %d: %f %d" % (len(x)-N, toc-tick, sum1)

    assert x == y, "These should be identical when N==1000000"

    print "== Two summations of the same array =="

    tick=time.clock()
    sum1=0
    for i in x:
        sum1=sum1+i
    toc=time.clock()
    print "ref (again)", toc-tick, sum1

    tick=time.clock()
    sum1=0
    for i in y:
        sum1=sum1+i
    toc=time.clock()
    print "final range (again)", toc-tick, sum1
When I run the above, the output is:

    ref 0.158759 499999500000
    shuffle last 1000000: 0.254425 499999500000
    shuffle last 750000: 0.228932 499999500000
    shuffle last 500000: 0.252207 499999500000
    shuffle last 250000: 0.224307 499999500000
    shuffle last 0: 0.255656 499999500000
    == Two summations of the same array ==
    ref (again) 0.158914 499999500000
    final range (again) 0.250253 499999500000
As you can see, the last two tests use arrays x and y where x == y == range(1000000), but their timing numbers are quite different. Branch prediction shouldn't cause that difference. Poor cache behavior might.

Here's an attempt to double-check my view. I use an array.array(), which stores the numbers sequentially both for the sorted and random versions:

    import array
    import time
    from random import shuffle

    x = [i for i in range(1000000)]
    x = array.array("i", x)
    tick=time.clock()
    sum1=0
    for i in x:
        sum1=sum1+i
    toc=time.clock()
    print "sorted", toc-tick, sum1

    x = [i for i in range(1000000)]
    shuffle(x)
    x = array.array("i", x)
    tick=time.clock()
    sum1=0
    for i in x:
        sum1=sum1+i
    toc=time.clock()
    print "unsorted", toc-tick, sum1
When I run the above, I find that both timings are essentially identical:

    sorted 0.159446 499999500000
    unsorted 0.161541 499999500000
This gives very strong evidence (unless I made another mistake) that branch prediction does not play a role, while cache prediction should be the actual suspect.
That link now says "Page Not Found". What happened to it?