by LtWorf 105 days ago
But Python can fork itself and run multiple processes in a single container. Why would you need to run several containers just to run several processes?

There's even the multiprocessing module in the stdlib to achieve this.
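For illustration, here is a minimal sketch of one Python program fanning work out to several forked processes. The function names and the chunking scheme are made up for this example, and it assumes a Unix-like system where the "fork" start method is available.

```python
import multiprocessing as mp


def sum_of_squares(start, stop, out):
    # Runs in a child process; the result goes back to the parent over a pipe.
    out.put(sum(i * i for i in range(start, stop)))


def parallel_sum_of_squares(n, workers=4):
    # Hypothetical helper: split range(n) into chunks, one forked process each.
    ctx = mp.get_context("fork")  # assumption: Unix-like OS
    out = ctx.SimpleQueue()
    step = max(1, -(-n // workers))  # ceiling division, never zero
    procs = [
        ctx.Process(target=sum_of_squares, args=(lo, min(lo + step, n), out))
        for lo in range(0, n, step)
    ]
    for p in procs:
        p.start()
    total = sum(out.get() for _ in procs)  # one partial result per process
    for p in procs:
        p.join()
    return total


if __name__ == "__main__":
    print(parallel_sum_of_squares(1000))
```

All of the worker processes live inside whatever container the parent runs in, which is the point being made above.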


Threads are cheap: you can do N units of work simultaneously with N threads in one process, without serialization, IPC, or process creation overhead.

With multiprocessing, processes are expensive and each piece of work hogs a whole process. You must serialize data twice for IPC, which is expensive and time-consuming.
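A rough sketch of the difference: a thread receives the very same object, while anything crossing a process boundary is pickled and rebuilt as a copy. The pickle round trip below only simulates what IPC does to an argument and to a result; no real worker process is involved, and the helper names are invented for the example.

```python
import pickle
from concurrent.futures import ThreadPoolExecutor


def identity(obj):
    return obj


def via_thread(obj):
    # A thread shares the parent's memory, so nothing is copied or serialized.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(identity, obj).result()


def via_pickle_round_trip(obj):
    # Simulated IPC: serialize on the way in, then again on the way out.
    received = pickle.loads(pickle.dumps(obj))      # parent -> worker
    return pickle.loads(pickle.dumps(received))     # worker -> parent


data = list(range(1000))
same_object = via_thread(data) is data              # shared, no copy
copied_object = via_pickle_round_trip(data)         # equal, but a new object
```

The copy is equal to the original but costs CPU time and memory proportional to the size of the data, every time it crosses the boundary.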

You shouldn't have to break out multiple processes, for example, to do some simple pure-Python math in parallel. It doesn't make sense to use multiple processes for something like that because the actual work you want to do will be overwhelmed by the IPC overhead.

There are also limitations: only some data can be sent between processes. Not all of your objects can be serialized for IPC.
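As a small illustration of that limitation, the sketch below checks whether a few ordinary objects survive `pickle.dumps`, which is what multiprocessing uses by default to move data between processes. The `is_picklable` helper is hypothetical, not part of any library.

```python
import pickle
import threading


def is_picklable(obj):
    # Hypothetical helper: True if the default pickler can serialize obj.
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True


plain_data_ok = is_picklable([1, 2, 3])
lambda_ok = is_picklable(lambda x: x + 1)   # lambdas have no importable name
lock_ok = is_picklable(threading.Lock())    # OS-level resources cannot be copied
```

Plain containers of numbers and strings are fine; lambdas, locks, open sockets, and similar objects are not, so code that passes them around freely between threads has to be restructured before it can use processes.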

I think you have a good point on IPC, but process creation in Linux is almost as fast as thread creation.

Unless the app is constantly creating and killing processes, the process creation overhead won't be that much. IPC is the killer, though.

And also your types aren't picklable or whatever, and now you gotta change a lot of stuff to get it to work lol.

It makes sense to me that a program currently written using multiple processes would now be re-written to use multiple truly parallel threads. But it seems very odd to suggest (as your grandparent comment does) that a program currently run in multiple containers would likely be migrated to run on multiple threads.

In other words, I imagine anyone who cares about the overhead from serialization, IPC, or process creation would already be avoiding (as much as possible) using containers to scale in the first place.

Yeah, I somehow glossed over the whole container thing.

The container thing might be a horizontal scaling thing, where 1 container runs on 1 instance with 1 vCPU. Running multiple processes per instance means you need beefier slices of compute to take advantage of the parallelism, and you can't cleanly scale up and then down using only the resources you need.

If you have a queue distributing work, that model makes sense with single-threaded interpreters, where consumer instances are spun up and down as needed. The alternative is pushing work to a thread pool, or to multiple instances with their own thread pools, that aren't inhibited by the GIL. The latter could be more efficient depending on the work.

But… in Python threads don't run in parallel, which is the whole problem we are working around here.
Forking and multithreading do not coexist. Even if one of your transitive dependencies decides to launch a thread that's 99% idle, it becomes unsafe to fork.
I'm curious about the downvotes on this. It's absolutely true. From 2016 to 2020 or so I maintained a job runner daemon for a certain megacorp that ran hundreds of thousands of who-knows-what Python tasks/jobs a day, with arbitrary code, on shared infra, and this was one of the insidious and ugly failure modes to go debug and handle. The docs really make it sound like you can mix threading and multiprocessing, but you can never completely ensure that threading followed by a bare fork will be safe, period. It's really irritating that the docs would have you believe that this is OK or safe, but it is in keeping with the Python philosophy of trying to hide the edge of the blade you're using until it's too late and you've cut the shit out of yourself.
I'm replying to a person who scales Python by running several containers instead of 1 container with several Python processes.
Why is it unsafe?
In general only the thread calling fork() gets forked, so unless you call exec() soon after, there are a lot of complications with signals and shared memory.
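For the "call exec() soon after" case, a minimal sketch using the standard `subprocess` module, which starts a fresh program image in the child instead of continuing to run a copy of the parent's memory. The helper name is made up for this example.

```python
import subprocess
import sys


def run_in_fresh_interpreter(code):
    # Hypothetical helper: the child is replaced by a brand-new Python
    # interpreter, so it inherits none of the parent's threads or lock state.
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_in_fresh_interpreter("print(6 * 7)"))
```

The cost is that nothing is shared implicitly: whatever the child needs has to be passed through arguments, pipes, or files.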
What are the complications? A single thread with its own process sandbox with everything from the parent is exactly what I'd expect coming from C land. Are the complications you refer to specific to the python VM or more general?
Even treating the process as read-only after forking is potentially fraught. What if a background thread is mutating some data structure? When the process forks, the data structure might be internally inconsistent because the work to finish the mutation was never completed. Imagine locks being held by various threads at the moment of the fork: those threads are gone in the child, so trying to take those locks there might deadlock, or even worse. There are tons of these types of gotchas.
Okay so just all the usual threading gotchas. Nothing specific to Python.

Conceptually fork "just" noncooperatively preempts and kills all other threads. Use accordingly. Yes it's a giant footgun but then so is all low level "unmanaged" concurrency.

If you have multiple threads, you almost certainly have mutexes. If your fork happens when a non-main thread holds a mutex, your main thread will never again be able to hold that mutex.
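A minimal sketch of that failure, assuming a Unix-like system with `os.fork`. The function name and timings are illustrative only. The child uses a timeout so the demo reports the problem instead of hanging forever; recent Python versions also emit a DeprecationWarning when forking with live threads, which is silenced here.

```python
import os
import threading
import time
import warnings


def child_can_take_lock():
    lock = threading.Lock()
    lock_is_held = threading.Event()

    def hold_lock_briefly():
        with lock:
            lock_is_held.set()
            time.sleep(1.0)  # still holding the lock when the fork happens

    holder = threading.Thread(target=hold_lock_briefly)
    holder.start()
    lock_is_held.wait()

    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()
    if pid == 0:
        # Child: the holder thread was not copied, but the lock's "locked"
        # state was, so nobody is left to release it.
        got_it = lock.acquire(timeout=0.5)
        os._exit(0 if got_it else 1)

    _, status = os.waitpid(pid, 0)
    holder.join()
    return os.WEXITSTATUS(status) == 0


if __name__ == "__main__":
    print("child acquired the lock:", child_can_take_lock())
```

Without the timeout, the child would block on `lock.acquire()` indefinitely, which is what the real-world version of this bug looks like.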

An imperfect solution is to require every mutex created to be accompanied by some pthread_atfork, but libraries don’t do that unless forking is specifically requested. In other words, if you don’t control the library you can’t fork.
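In Python the closest analogue is `os.register_at_fork`. A sketch under the assumption that you own the lock in question (the `registry_lock` name is made up): take the lock just before forking, then release it on both sides, so the child never inherits it in a held state.

```python
import os
import threading
import time
import warnings

registry_lock = threading.Lock()  # hypothetical lock owned by "our" library

# Acquire before fork so no other thread can be holding it mid-fork,
# then release the copy in the parent and the copy in the child.
os.register_at_fork(
    before=registry_lock.acquire,
    after_in_parent=registry_lock.release,
    after_in_child=registry_lock.release,
)


def child_can_take_lock():
    started = threading.Event()

    def hold_briefly():
        with registry_lock:
            started.set()
            time.sleep(0.2)

    holder = threading.Thread(target=hold_briefly)
    holder.start()
    started.wait()

    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()  # the "before" hook waits here for the holder
    if pid == 0:
        got_it = registry_lock.acquire(timeout=0.5)
        os._exit(0 if got_it else 1)

    _, status = os.waitpid(pid, 0)
    holder.join()
    return os.WEXITSTATUS(status) == 0
```

This only helps for locks you know about and can register; a mutex buried inside a third-party or C library gets no such treatment, which is the limitation described above.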

Fork-then-thread works, does it not?
If you have enough discipline to make sure you only create threads after all the forking is done, then sure. But having such discipline is harder than just forbidding fork or forbidding threads in your program. It turns a careful analysis of timing and causality into just banning a few functions.
Can't you check what threads are active at the time you fork?
And what do you do with that information? Refuse to fork after you detect more than one thread running? I haven’t seen any code that gracefully handles the unable-to-fork scenario. When people write fork-based code, especially in Python, they always expect forking to succeed.
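For what it's worth, the check itself is short. A sketch with a hypothetical wrapper that refuses to fork when more than one Python thread is alive; note that `threading.active_count()` only sees threads created through Python's `threading` module, so native threads started by C extensions go unnoticed, and the caller still has to decide what to do with the error.

```python
import os
import threading


def fork_if_single_threaded():
    # Hypothetical guard: raise instead of forking into a risky state.
    alive = threading.active_count()
    if alive > 1:
        raise RuntimeError(f"refusing to fork: {alive} Python threads are alive")
    return os.fork()
```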
But not the reverse. Thread-then-fork is unsafe if it's a bare fork and the code isn't strictly free of mutexes and shared resources (which is hard), and there are few or no warning lights to indicate that this is a terrible idea that fails in really unpredictable and hard-to-debug ways.