by jeeeb 1893 days ago
It’s even slightly more subtle than that.

Python multiprocessing doesn’t use fork on Windows. It starts a new process and so shouldn’t be affected by this.

So to trigger this you need to have num_workers > 0 on your DataLoader and be running on a non-Windows platform.
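A rough way to check that condition, as a sketch rather than anything PyTorch itself does: the helper names below are made up, `num_workers` just mirrors the DataLoader argument, and it only looks at the platform default, so it ignores a start method you have set explicitly.

```python
import multiprocessing as mp


def default_start_method() -> str:
    """Return the platform's default start method without fixing it."""
    # Per the multiprocessing docs, the first entry of this list is the default.
    return mp.get_all_start_methods()[0]


def workers_would_fork(num_workers: int) -> bool:
    """Hypothetical helper: True if worker processes would be created via fork."""
    return num_workers > 0 and default_start_method() == "fork"


if __name__ == "__main__":
    print(default_start_method(), workers_would_fork(4))
```

The result depends on both the OS and the Python version, so it's worth running on the machine you actually train on.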

2 comments

I get the desire to be pedantic, but does anyone at all train DL models on Windows (barring toy projects for fun and perhaps debugging)? The same can be said about num_workers > 0. You _have to_ fork worker processes unless you train something super tiny like MNIST and load the whole dataset onto the GPU.
> does anyone at all train DL models on Windows?

Yes. My last job was at a financial shop that was all Windows. They were doing ML with Python on Windows. Azure has boxes available for this.

Starting with Python 3.8, multiprocessing also defaults to `spawn` (starting new processes) on macOS, because some system libraries are not fork-safe.

IMHO cross-platform Python projects should call `multiprocessing.set_start_method('spawn')` to get the same behavior everywhere.
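A minimal sketch of what that looks like, using only the standard library. `spawn_context` is a made-up helper name; the two `multiprocessing` calls are the documented ones.

```python
import multiprocessing as mp


def spawn_context():
    """Return a context whose Process, Pool and Queue objects use "spawn"."""
    # Hypothetical helper: suits library code, since it leaves the
    # process-wide default untouched.
    return mp.get_context("spawn")


if __name__ == "__main__":
    # Application code: fix the global start method once, before any worker
    # process is created. A second call raises RuntimeError unless
    # force=True is passed.
    mp.set_start_method("spawn")
    print(mp.get_start_method())
```

Keep in mind that with `spawn` the child starts from a fresh interpreter, so worker targets have to be importable top-level functions and their arguments have to be picklable.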