by zb 2050 days ago
> Python's "threads" are actually fibers.

They’re actually not. They are native OS threads; they just contend heavily for a single lock, the GIL.
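A quick way to see this for yourself, as a rough sketch assuming Python 3.8+ (where `threading.get_native_id()` exists): every thread reports its own kernel-assigned thread ID, while all of them share one process ID. The helper name `collect_ids` is made up for illustration.

```python
import os
import threading


def collect_ids(n_threads=4):
    """Start n_threads threads and record the (pid, native thread id) each one sees."""
    seen = []
    lock = threading.Lock()
    # The barrier keeps all threads alive at the same moment, so the OS
    # cannot hand a finished thread's ID to a later one.
    barrier = threading.Barrier(n_threads)

    def worker():
        barrier.wait()
        with lock:
            seen.append((os.getpid(), threading.get_native_id()))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


if __name__ == "__main__":
    for pid, tid in collect_ids():
        print(f"pid={pid} native thread id={tid}")
```

Because each one is a real kernel thread, the OS scheduler is free to place them on different cores; what stops pure-Python code from running in parallel is the GIL, not the scheduler.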

Async is arguably fibers, as are green threads in libraries like gevent or eventlet.
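To illustrate the fiber-like behaviour, here is a minimal sketch using only the standard `asyncio` module (the names `task` and `main` are just for this example): two coroutines take turns on a single OS thread, and control changes hands only at an explicit `await`.

```python
import asyncio
import threading


async def task(name, log):
    for step in range(2):
        log.append((name, step, threading.get_ident()))
        # Control leaves this coroutine only here, at an explicit await.
        await asyncio.sleep(0)


async def main():
    log = []
    await asyncio.gather(task("a", log), task("b", log))
    return log


if __name__ == "__main__":
    for name, step, thread_id in asyncio.run(main()):
        print(name, step, thread_id)
```

Every entry carries the same thread identifier, which is the sense in which this is cooperative scheduling rather than OS-level threading.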

> But if you want to use real threads, multiprocessing's "processes" are actually system threads.

They’re system threads running in separate memory spaces. Also known as… processes.
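The "separate memory spaces" part can be sketched like this, assuming a Unix-like OS where the `fork` start method is available (the helper names are hypothetical): a thread's change to a module-level variable is visible to the parent, while a child process only changes its own copy.

```python
import multiprocessing as mp
import threading

counter = 0  # module-level state, shared by every thread in this process


def bump():
    global counter
    counter += 1


def bump_in_thread():
    """Increment counter from another thread; return the parent's view of it."""
    t = threading.Thread(target=bump)
    t.start()
    t.join()
    return counter


def _bump_and_report(q):
    bump()
    q.put(counter)  # the value as seen inside the child process


def bump_in_process():
    """Increment counter in a child process; return (child's view, parent's view)."""
    ctx = mp.get_context("fork")  # assumption: fork is supported on this OS
    q = ctx.Queue()
    p = ctx.Process(target=_bump_and_report, args=(q,))
    p.start()
    child_value = q.get(timeout=30)
    p.join()
    return child_value, counter


if __name__ == "__main__":
    print("after thread:", bump_in_thread())
    print("after process (child, parent):", bump_in_process())
```

The child sees its increment, but the parent's `counter` does not move, which is why anything passed to or from a `multiprocessing` worker has to be sent explicitly (pickled over a pipe or queue).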


You're right. To me they just feel like fibers because they can't run in parallel.

If you use Numba (or Cython, C extensions, etc.), the threads can run that code without holding the GIL, so they can run in parallel. Here's an example that should keep all cores of a CPU pegged at 100% utilization for a while:

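  import numba as nb
  from concurrent.futures import ThreadPoolExecutor
  from multiprocessing import cpu_count

  # nopython=True makes compilation fail loudly instead of falling back to
  # object mode, where the GIL cannot be released.
  @nb.jit(nopython=True, nogil=True)
  def slow_calculation(x):
      out = 0.0
      for i in range(x):
          out += i**0.01
      return out

  # One worker thread per core; each call releases the GIL while it runs.
  ex = ThreadPoolExecutor(max_workers=cpu_count())
  futures = [ex.submit(slow_calculation, 100_000_000_000+i) for i in range(cpu_count())]

> and they can run in parallel.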

Even without requiring the GIL, these are still child threads of the main process, correct? And because of that, wouldn't the OS keep them all on the same core? And if that's the case, would ProcessPoolExecutor solve that problem?

I had no idea that existed, thank you!