from multiprocessing import Pool
p = Pool(4) # 4 worker threads
p.map(somefunc, someiter)
Not sure if this is apples-to-apples. Regardless, I'm always skeptical of the behavior of threaded code, even if it's embarrassingly parallel stuff that is divided among workers.
Unfortunately the multiprocessing module is a lot less efficient in many cases. Scala doesn't have to copy the entire collection across process boundaries or launch the runtime multiple times. (In this particular rather trivial example it doesn't matter much because an Int range is used). I'm afraid Python is losing out in the parallel world due to the GIL as more CPU cores become the norm.