Hacker News new | ask | show | jobs
by thaumasiotes 484 days ago
> The number of swaps is nearly the same between Bubble Sort and Can't Believe Sort. For a random list with 100 elements, it's 2505 swaps vs. 2513 swaps.

Why is there any difference? Does the entire difference come from the iteration where i = 1?

> Bubble Sort always does exactly n(n-1)/2 comparisons

Bubble sort can do less than this. When you're bubbling down into a sorted list, and you find a value lower than you are, you can assume that all values below that one are also lower than you are and terminate the inner loop early.

1 comments

> Why is there any difference? Does the entire difference come from the iteration where i = 1?

Sometimes it's more swaps, and sometimes it's less. The vague pattern looks like Can't Believe Sort does slightly fewer swaps on larger lists, but that could also have been noise.

Why do you expect them to have exactly the same number of swaps?

> Bubble sort can do less than this.

Oh sorry yeah, I implemented bubble sort without any sort of early return, for a more direct comparison.

I expect them to have exactly the same number of swaps in every iteration of the loop past the first one, because they are doing exactly the same thing. It's not just the same number of swaps. It's essentially the same set of swaps, in a different order. In the iteration where i = 1, the number of swaps may differ. Was I wrong about that?
Yeah, 2410 vs. 2509 swaps not including the first iteration. You should try measuring these things yourself, I'm not sure if it's doing what you think it's doing.
Well, I sort of agree with you and sort of disagree.

    from random import randrange

    def list_swap(lst, i, j):
        temp = lst[i]
        lst[i] = lst[j]
        lst[j] = temp


    def icbicsort(lst):
        swap_count = 0
        n = len(lst)
        for i in range(n):
            for j in range(n):
                if lst[i] < lst[j]:
                    swap_count += 1
                    list_swap(lst, i, j)
        return swap_count

    def bubble(lst):
        swap_count = 0
        n = len(lst)
        # preprocess
        # this is intentionally identical to icbicsort with i = 0
        for j in range(n):
            if lst[0] < lst[j]:
                swap_count += 1
                list_swap(lst, 0, j)
        # go forward, bubbling backward
        for i in range(1, n):
            for j in range(i, 0, -1):
                if lst[j] < lst[j-1]:
                    swap_count += 1
                    list_swap(lst, j-1, j)
        return swap_count

    def cmp(n=100):
        alst = []
        blst = []
        for t in range(n):
            value = randrange(65536)
            alst.append(value)
            blst.append(value)
        icb_swaps = icbicsort(alst)
        bubble_swaps = bubble(blst)
        print(f"icb: {icb_swaps} swaps; bubble: {bubble_swaps} swaps")
    
    # just for completeness
    def bubble_classic(lst):
        n = len(lst)
        sorted = False
        while not sorted:
            sorted = True
            for i in range(n-1):
                if lst[i] > lst[i+1]:
                    sorted = False
                    list_swap(lst, i, i+1)
    return
26 calls to cmp() produced identical swap counts in 24 cases and slight discrepancies (of 2 and 3 swaps, both favoring icbsort) in two cases. (For statistical purposes, this was a manual inspection where I stopped after getting the second discrepancy.)

If you're regularly seeing discrepancies, something is weird. If you're seeing them occasionally, I still think something is weird, but I'm prepared to believe that it's a bug in the code.

I will note, in support of the paper's actual thesis, that I wrote icbicsort() correctly on the first try, and I had to work many bugs out of bubble(). Maybe it's still bugged.