| HN Mirror



	by winrid 1172 days ago
	Python's for loop implementation is also slow. You can use built-in utilities like map(), which are "native" and can be a lot faster than a for loop with an append: https://levelup.gitconnected.com/python-performance-showdown...

3 comments

akasakahakada 1172 days ago

Nope. map() is the same speed as a for loop.

The benchmarking methodology in the link is not good. The author should use timeit, cProfile, or similar. A difference of 0.01 s is mostly due to fluctuation. The order of execution also matters: if you want to test functions A and B, you actually need to run A, B, B, A to see whether the ordering makes a difference.
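
A minimal sketch of that methodology, for anyone who wants to try it. The two candidate functions `with_loop` and `with_map` and the helper `bench` are hypothetical names made up here, not taken from the linked article. Each candidate is timed with `timeit.repeat` in A, B, B, A order and the best time per function is kept:

```python
import timeit


def with_loop(arr):
    # Hypothetical candidate A: explicit for loop with append.
    out = []
    for x in arr:
        out.append(str(x))
    return out


def with_map(arr):
    # Hypothetical candidate B: map() with a builtin.
    return list(map(str, arr))


def bench(func_a, func_b, arr, number=10, repeat=3):
    """Time the two functions in A, B, B, A order; keep the best time of each."""
    best = {}
    for func in (func_a, func_b, func_b, func_a):
        # min() over the repeats discards runs disturbed by other processes.
        t = min(timeit.repeat(lambda: func(arr), number=number, repeat=repeat))
        best[func.__name__] = min(t, best.get(func.__name__, float("inf")))
    return best


data = list(range(10_000))
results = bench(with_loop, with_map, data)
for name, seconds in results.items():
    print(f"{name}: {seconds:.4f} s (best of all runs)")
```

Taking the minimum rather than the mean is a common choice for micro-benchmarks, since noise from the rest of the system only ever makes a run slower.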


winrid 1172 days ago

Yeah, I guess this isn't true anymore. It looks like it may have been true back in the 2.6 days.


akasakahakada 1172 days ago

I immediately verified both claims.

list(map(func, arr)) did bring about a 10% benefit when func is a builtin, e.g. int() or str().

But if func is tuple(), list(), set(), or any kind of user-defined function, list(map()) is always slower.

You can try it yourself to see that list(map()) does not do well:

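    import gc
    import numpy as np

    # Assumption: the original call was garbled (np.arrange does not exist and
    # arange(100000, 100000) is empty); any 2-D array of rows works here.
    a = np.arange(100000 * 10).reshape(100000, 10)

    # Each %%timeit block below is its own IPython/Jupyter cell.
    %%timeit
    b1 = [np.sum(x) for x in a]
    # repeat once
    %%timeit
    b2 = list(map(np.sum, a))
    # repeat once

    # Collect garbage, then run the two in reverse order.
    gc.collect()
    %%timeit
    b2 = list(map(np.sum, a))
    # repeat once
    %%timeit
    b1 = [np.sum(x) for x in a]
    # repeat once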
    # repeat once

I guess that's why I use map() only in cases like 'list(map(itemgetter(...), arr))'; in general there is no benefit to using it.
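
For reference, the itemgetter case might look like the sketch below. The `rows` data and the index 1 are made up for illustration. In CPython, `operator.itemgetter` returns a callable implemented in C, which is presumably why this pattern pairs well with map():

```python
from operator import itemgetter

# Hypothetical data: (name, score) pairs.
rows = [("a", 3), ("b", 1), ("c", 2)]

# map() with a C-implemented callable: no Python-level function call per item.
second = list(map(itemgetter(1), rows))

# The equivalent list comprehension, for comparison.
second_lc = [row[1] for row in rows]

print(second)
```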


winrid 1171 days ago

thanks!


FreeHugs 1172 days ago

I don't think it's the loop implementation. The stuff in the loop should take multiple orders of magnitude more time than the loop itself:

    for poly in polygon_subset:
        if np.linalg.norm(poly.center - point) < max_dist:
            close_polygons.append(poly)


zarzavat 1172 days ago

I don’t know if numpy fixed this, but it used to be that mixing Python numbers with numpy in a tight loop was horribly slow. Try hoisting max_dist out of the loop: replace it with a max_dist_np that is converted to a numpy float once.
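
A rough sketch of that hoisting, assuming a stand-in `Poly` class with a `center` attribute (hypothetical, the real polygon type is not shown in the thread) and made-up values for `point` and `max_dist`. Whether it still makes a measurable difference on a current numpy is not something this thread confirms:

```python
from dataclasses import dataclass

import numpy as np


@dataclass(eq=False)
class Poly:
    # Hypothetical stand-in for the polygon objects in the loop above.
    center: np.ndarray


rng = np.random.default_rng(0)
polygon_subset = [Poly(center=rng.random(2)) for _ in range(1_000)]
point = np.array([0.5, 0.5])
max_dist = 0.1  # plain Python float

# Convert once, outside the loop, so every comparison is numpy-vs-numpy.
max_dist_np = np.float64(max_dist)

close_polygons = []
for poly in polygon_subset:
    if np.linalg.norm(poly.center - point) < max_dist_np:
        close_polygons.append(poly)

print(len(close_polygons))
```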


akasakahakada 1172 days ago

Speaking of this, I once found that

    for x in arr:  # arr is a numpy array

was 9x slower than

    for x in arr.tolist():

in 2021.
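
One plausible reason, sketched below: iterating the array directly hands back a numpy scalar object for each element, while tolist() converts everything to native Python numbers in a single pass. The 9x figure is my own measurement from back then; this snippet only shows the type difference, not the timing:

```python
import numpy as np

arr = np.arange(5)

# Iterating the array directly yields a numpy scalar object per element.
direct_types = {type(x) for x in arr}

# tolist() converts all elements to native Python ints up front.
list_types = {type(x) for x in arr.tolist()}

print(direct_types, list_types)
```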


tweakimp 1172 days ago

It's not the looping itself that is slow in the article you linked; it's that every element is appended to the list. If you use a list comprehension it's even faster, and it still loops over all elements of the list.


masklinn 1172 days ago

Here is the disassembly of the listcomp

    [x for x in range(5)]

    RESUME 0
    BUILD_LIST
    LOAD_FAST
    FOR_ITER 4
    STORE_FAST (x)
    LOAD_FAST (x)
    LIST_APPEND
    JUMP_BACKWARD 5
    RETURN_VALUE

As you can see from the third-to-last instruction, a listcomp does append individual elements to the list. What it doesn’t need to do is call a method to do so (let alone look up the corresponding method).
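
If you want to check this on your own interpreter, something like the sketch below should work. The helper `opnames` and the wrapper function `build` are names I made up. It walks nested code objects because before CPython 3.12 the comprehension is compiled as a separate code object, while 3.12 and later inline it into the enclosing function:

```python
import dis


def build():
    return [x for x in range(5)]


def opnames(code):
    """Collect opcode names from a code object and any nested code objects."""
    names = [ins.opname for ins in dis.get_instructions(code)]
    for const in code.co_consts:
        if hasattr(const, "co_code"):  # nested code object (pre-3.12 listcomp)
            names.extend(opnames(const))
    return names


names = opnames(build.__code__)
print("LIST_APPEND" in names)
```

Calling `dis.dis(build)` prints the full listing, though the exact opcodes vary between CPython versions.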


winrid 1172 days ago

No, AFAIK each for loop iteration pushes to and pops from the stack in the interpreter, while map() loops entirely inside the interpreter's native implementation.
