| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by dalke 4671 days ago

My apologies. I misread the code and what it did, because of the use of the word 'sorted' and my belief that branch prediction should not affect this code.

I still don't believe it's due to branch prediction. I believe it's due to cache coherency.

Here's a more extensive example, where I shuffle the last N elements:

    import time
    from random import shuffle

    x = [i for i in range(1000000)]
    tick=time.clock()
    sum1=0
    for i in x:
        sum1=sum1+i
    toc=time.clock()
    print "ref", toc-tick, sum1

    for N in (0, 250000, 500000, 750000, 1000000):
        y1 = range(0, N)
        y2 = range(N, 1000000)
        shuffle(y2)
        y = y1 + y2
        tick=time.clock()
        sum1=0
        for i in y:
            sum1=sum1+i
        toc=time.clock()
        print "shuffle last %d: %f %d" % (len(x)-N, toc-tick, sum1)

    assert x == y, "These should be identical when N==1000000"

    print "== Two summations of the same array =="

    tick=time.clock()
    sum1=0
    for i in x:
        sum1=sum1+i
    toc=time.clock()
    print "ref (again)", toc-tick, sum1

    tick=time.clock()
    sum1=0
    for i in y:
        sum1=sum1+i
    toc=time.clock()
    print "final range (again)", toc-tick, sum1

When I run the above, the output is:

    ref 0.158759 499999500000
    shuffle last 1000000: 0.254425 499999500000
    shuffle last 750000: 0.228932 499999500000
    shuffle last 500000: 0.252207 499999500000
    shuffle last 250000: 0.224307 499999500000
    shuffle last 0: 0.255656 499999500000
    == Two summations of the same array ==
    ref (again) 0.158914 499999500000
    final range (again) 0.250253 499999500000

As you can see, the last two tests use arrays x and y where x == y == range(1000000), but their timing numbers are quite different. Branch prediction shouldn't cause that difference. Poor cache behavior might.

Here's an attempt to double-check my view. I use an array.array(), which stores the numbers sequentially both for the sorted and random versions:

    import array
    import time
    from random import shuffle

    x = [i for i in range(1000000)]
    x = array.array("i", x)
    tick=time.clock()
    sum1=0
    for i in x:
        sum1=sum1+i
    toc=time.clock()
    print "sorted", toc-tick, sum1

    x = [i for i in range(1000000)]
    shuffle(x)
    x = array.array("i", x)
    tick=time.clock()
    sum1=0
    for i in x:
        sum1=sum1+i
    toc=time.clock()
    print "unsorted", toc-tick, sum1

When I run the above, I find that both timings are essentially identical:

    sorted 0.159446 499999500000
    unsorted 0.161541 499999500000

This gives very strong evidence (unless I made another mistake) that branch prediction does not play a role, while cache prediction should be the actual suspect.