Python's for loop implementation is also slow. You can use built-in utilities like map(), which are "native" and can be a lot faster than a for loop with an append:
The benchmarking methodology in the link is not good. The author should use timeit, cProfile, or something similar. A difference of 0.01s is mostly due to fluctuation. The order of execution also matters. Say you want to test functions A and B: you actually need to run A, B, B, A to see whether the ordering makes a difference.
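For what it's worth, here is a rough sketch of what an A, B, B, A run with timeit could look like. The function names, the input size, and the choice of str() as the mapped function are just my assumptions for illustration, not what the linked article used.

```python
import timeit

def with_listcomp(arr):
    # Variant A: list comprehension
    return [str(x) for x in arr]

def with_map(arr):
    # Variant B: list(map(...)) with a builtin
    return list(map(str, arr))

def bench(func, arr, number=10, repeat=3):
    # Take the minimum of several repeats to reduce noise from fluctuation.
    return min(timeit.repeat(lambda: func(arr), number=number, repeat=repeat))

arr = list(range(10_000))

# Run in A, B, B, A order so that an ordering effect shows up as a
# difference between the two measurements of the same function.
order = [with_listcomp, with_map, with_map, with_listcomp]
results = {}
for func in order:
    results.setdefault(func.__name__, []).append(bench(func, arr))

for name, times in results.items():
    print(name, ["%.4f" % t for t in times])
```

If the two timings for the same function differ by about as much as the gap between A and B, the gap is probably noise.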
list(map(func, arr)) did bring about a 10% benefit when func is a builtin, e.g. int() or str().
But if func is tuple(), list(), set(), or any kind of user-defined function, list(map()) was always slower in my tests.
You can try it yourself to see that list(map()) does not perform well:
import numpy as np
a = np.arange(100000)
%%timeit
b1 = [np.sum(x) for x in a]
# repeat once
%%timeit
b2 = list(map(np.sum, a))
# repeat once
import gc
gc.collect()
%%timeit
b2 = list(map(np.sum, a))
# repeat once
%%timeit
b1 = [np.sum(x) for x in a]
# repeat once
I guess that's why I use map() only in cases like 'list(map(itemgetter, arr))', because generally there is no benefit to using it.
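To make the itemgetter case concrete, here is a small sketch. The sample data and the index 1 are made up for illustration.

```python
from operator import itemgetter

# Hypothetical sample data: (name, score) pairs
arr = [("a", 3), ("b", 1), ("c", 2)]

# itemgetter(1) is implemented natively, so map() never calls back
# into a Python-level function here.
scores_map = list(map(itemgetter(1), arr))

# Equivalent list comprehension, for comparison
scores_comp = [pair[1] for pair in arr]

print(scores_map == scores_comp)
```

Whether the map() version is actually faster on your machine is something to measure with timeit rather than take on faith.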
I don’t know if numpy fixed this, but it used to be that mixing Python numbers with numpy values in a tight loop was horribly slow. Try hoisting max_dist out of the loop and replacing it with a max_dist_np that is converted to a numpy float once.
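Something along these lines is what I mean by hoisting. I don't have the original loop in front of me, so the function name count_within and the comparison inside the loop are assumptions; only max_dist and max_dist_np come from the suggestion above.

```python
import numpy as np

def count_within(dists, max_dist):
    # Convert the Python number to a numpy float once, outside the loop,
    # instead of mixing a Python number with numpy scalars on every iteration.
    max_dist_np = np.float64(max_dist)
    count = 0
    for d in dists:  # d is a numpy scalar when dists is a numpy array
        if d <= max_dist_np:
            count += 1
    return count

print(count_within(np.array([0.5, 1.5, 2.5]), 2))
```

Whether this still helps on a current numpy is worth timing before relying on it.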
It's not the looping itself that is slow in the article you linked; it's that every element is appended to the list.
If you use a list comprehension, it's even faster, and it still loops over all elements of the list.
As you can see from the third-to-last instruction, a listcomp does append individual elements to the list. What it doesn’t need to do is call a method to do so (let alone look up the corresponding method).
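If you want to check this yourself, here is a sketch using the dis module. The two functions are made-up examples, and the exact bytecode differs between CPython versions, so this only looks for the LIST_APPEND opcode rather than relying on a particular instruction position.

```python
import dis
import types

def with_append(items):
    result = []
    for x in items:
        result.append(x * 2)  # attribute lookup plus a method call per element
    return result

def with_listcomp(items):
    return [x * 2 for x in items]  # compiled to a LIST_APPEND opcode

def opnames(func):
    """Collect opcode names from a function and any nested code objects."""
    names = set()
    stack = [func.__code__]
    while stack:
        code = stack.pop()
        for ins in dis.get_instructions(code):
            names.add(ins.opname)
        # Older CPython versions compile a comprehension into a nested
        # code object stored in co_consts, so walk those as well.
        stack.extend(c for c in code.co_consts if isinstance(c, types.CodeType))
    return names

print("LIST_APPEND" in opnames(with_listcomp))
print("LIST_APPEND" in opnames(with_append))
```

Calling dis.dis(with_listcomp) prints the full listing if you want to see where the append happens.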
No, AFAIK each for loop iteration pushes to and pops from the interpreter's stack, while map() does its looping entirely inside the native implementation of the interpreter itself.