Hacker News new | ask | show | jobs
by not-my-account 1058 days ago
You end up having to do a lot of things in a ML training run, some of which you can do in parallel because it’s not important now (eg saving metadata) or because you’d otherwise be resource limited (eg loading data and formatting batches for training)
1 comments

And for this you cannot use Python's multiprocessing because ... ? Sure, moving data between processes is slow because of pickling [0]. However, I'm using parallel processing for the things you suggested, and for these it works great.

If I really had the use case and needed threads, I'd much rather use C++ bindings in a Python package than rebuilding the whole thing. Guess it depends on the scale we are talking about.

[0] https://pythonspeed.com/articles/faster-multiprocessing-pick...

It’s handling all the things that can go wrong when communicating and coordinating across processes, across machines, or troubleshooting bottlenecks on running systems that Elixir (and Erlang) excels.