Yeah. I recently switched some Blender Python algorithms I wrote to Swift/Metal, and the speedup was somewhere between 1,000x and 1,000,000x depending on the algorithm.
Yeah, properly written Python is, at the extreme worst, on the order of 1000x slower than the same code sped up with a NumPy/Numba/C/Fortran/etc. implementation. The brute-force loopy Python code I've seen is about 100x slower than the compiled alternatives. So I agree: these extreme numbers are a sign of writing the worst possible Python implementation of a thing and then saying Python sucks.
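For anyone who wants to check the loopy-vs-vectorized gap on their own machine, here is a minimal sketch. The function names (sum_squares_loop, sum_squares_numpy, compare) and the sum-of-squares workload are made up for illustration, not taken from anyone's actual code, and the ratio you get will depend heavily on the workload, the array size, and the hardware.

```python
import timeit

import numpy as np


def sum_squares_loop(values):
    """Brute-force pure-Python loop over a list of floats."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_squares_numpy(arr):
    """Same computation, delegated to NumPy's compiled dot product."""
    return float(np.dot(arr, arr))


def compare(n=100_000, repeats=5):
    """Return (loop_seconds, numpy_seconds), best of `repeats` runs each."""
    arr = np.arange(n, dtype=np.float64)
    values = arr.tolist()  # plain Python list for the loop version
    t_loop = min(timeit.repeat(lambda: sum_squares_loop(values),
                               number=1, repeat=repeats))
    t_numpy = min(timeit.repeat(lambda: sum_squares_numpy(arr),
                                number=1, repeat=repeats))
    return t_loop, t_numpy


if __name__ == "__main__":
    t_loop, t_numpy = compare()
    print(f"loop:  {t_loop:.6f} s")
    print(f"numpy: {t_numpy:.6f} s")
    if t_numpy > 0:
        print(f"ratio: {t_loop / t_numpy:.0f}x")
```

This only measures one trivial kernel on the CPU, so it says nothing about a GPU port like the Swift/Metal one mentioned above; it is just a quick way to see what a naive loop costs relative to a compiled inner loop.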
Who would have guessed that compiled, static, non-dynamic, hardware-accelerated code would be a ton more performant than interpreted, highly dynamic, garbage-collected, and very powerful code that is not hardware-accelerated?