

by skissane 3443 days ago

> > Another issue is that fork-and-exec doesn't work well with languages with complicated runtimes

> How are you doing fork-and-exec in a language with a large runtime? You are either using the language-provided APIs to do it, in which case they should document the restrictions on what you can call (and you should follow those), or you are dipping down into the C or system call layer to do your own fork-and-exec, in which case yeah, you still need to keep to the safe list of routines you can call between fork and exec, and you may have extra limitations since you are mucking around underneath your language's runtime (like you may have to unignore signals on your own, close file descriptors, etc). No surprises there.

Let's say I am using JNA – https://github.com/java-native-access/jna – under Java. It is safe to call posix_spawn from Java code using JNA. It is safe to call the Windows API equivalent (CreateProcess). It would be safe to call the handle/descriptor-based API I proposed. It is not safe to call fork. This is an undeniable deficiency of the fork-exec approach which competing approaches don't have. Furthermore, whatever compensating advantages fork-exec may have, the handle/descriptor-based API I proposed has the same advantages without this disadvantage.

> > Another issue is that it is very hard to implement robust error handling without race conditions in the fork-exec model.

> I don't think it is. You just print an error to stderr (write() is safe to call), and you return a bad error code (fork has built-in IPC for error codes via wait() in the parent).

But that isn't robust. How can the parent process reliably distinguish output the child wrote before the exec from output written after the exec? Likewise, how can the parent reliably distinguish an error exit status the child returned before the exec from an error exit status returned by the exec'd program? It can't.
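To make the ambiguity concrete, here is a minimal sketch in Python (chosen only for illustration; the function name `naive_spawn`, the missing path, and the use of exit status 127 are my assumptions, following the shell convention for "command not found"). It implements the "print to stderr and return a bad error code" approach and runs it against two cases the parent cannot tell apart.

```python
import os

def naive_spawn(argv):
    """Fork, exec argv in the child, and return the child's exit status.

    If exec fails, the child writes a message to stderr and exits with 127
    (an assumed convention, borrowed from the shell).
    """
    pid = os.fork()
    if pid == 0:
        # Child: either exec succeeds and never returns, or we report and exit.
        try:
            os.execvp(argv[0], argv)
        except OSError as exc:
            os.write(2, f"exec failed: {exc.strerror}\n".encode())
        os._exit(127)
    # Parent: the only information that comes back is the wait status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)

# Case 1: exec itself fails because the (hypothetical) program does not exist.
missing = naive_spawn(["/nonexistent/hypothetical-program"])

# Case 2: exec succeeds, and the program itself decides to exit with 127.
ran = naive_spawn(["/bin/sh", "-c", "exit 127"])

print(missing, ran)
```

Both calls hand the parent the same status, so from the exit code alone it cannot tell "exec never happened" from "the program ran and failed".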

For truly robust error handling, you'd actually need to do something like this: (1) create a pipe between the parent and child process, with FD_CLOEXEC set on the child's (write) end; (2) the child sends the parent a message "I'm about to exec" before calling exec; (3) if the exec call fails, the child sends the parent a message saying "exec failed with errno=.."; (4) if the exec call succeeds, FD_CLOEXEC closes the child's end of the pipe, so the parent sees end-of-file with no message after "I'm about to exec". This is my point: actually handling errors robustly in the fork-exec model is quite complex. In a handle/descriptor-based API it would be much simpler.

(And the above approach using a pipe isn't perfectly robust – what if the child process crashes for some reason between sending the "I'm about to exec" message and actually calling exec()? It is very difficult for the parent process to reliably distinguish that scenario from some failure in the program being exec()'d.)
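The pipe protocol described above can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not a definitive implementation: the function name `spawn_reporting`, the one-byte marker `ABOUT_TO_EXEC`, the stage labels, and the 4-byte errno encoding are all hypothetical choices of mine.

```python
import errno
import os
import struct

ABOUT_TO_EXEC = b"X"  # hypothetical one-byte "I'm about to exec" message

def spawn_reporting(argv):
    """Fork and exec argv; return (pid, stage, err).

    stage is "pre-exec" if the pipe closed before the announcement,
    "exec-failed" if the child reported an errno (returned in err), and
    "exec-ok" if the pipe closed right after the announcement.
    The caller is still responsible for waitpid(pid).
    """
    read_fd, write_fd = os.pipe()
    # Python's os.pipe() already returns close-on-exec descriptors;
    # this call just makes step (1) explicit.
    os.set_inheritable(write_fd, False)

    pid = os.fork()
    if pid == 0:
        # Child process.
        os.close(read_fd)
        try:
            os.write(write_fd, ABOUT_TO_EXEC)      # step (2)
            os.execvp(argv[0], argv)               # on success, never returns
        except OSError as exc:
            os.write(write_fd, struct.pack("!i", exc.errno))  # step (3)
        finally:
            os._exit(127)

    # Parent process: read until end-of-file, which arrives once every copy
    # of the write end is closed (by exec via close-on-exec, or by exit).
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 64)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    data = b"".join(chunks)

    if not data.startswith(ABOUT_TO_EXEC):
        return pid, "pre-exec", None
    if len(data) == len(ABOUT_TO_EXEC) + 4:
        return pid, "exec-failed", struct.unpack("!i", data[1:])[0]
    # Step (4). Note this is also what a crash between the announcement
    # and the exec call looks like, so the label is optimistic.
    return pid, "exec-ok", None

# A (hypothetical) missing program: exec fails and the errno comes back.
pid1, stage1, err1 = spawn_reporting(["/nonexistent/hypothetical-program"])
os.waitpid(pid1, 0)

# A program that runs and exits with 127 on its own.
pid2, stage2, err2 = spawn_reporting(["/bin/sh", "-c", "exit 127"])
_, status2 = os.waitpid(pid2, 0)

print(stage1, err1 == errno.ENOENT, stage2, os.WEXITSTATUS(status2))
```

Unlike the exit-code-only approach, the parent can now separate a failed exec from a program that ran and returned an error. It still takes a pipe, a small wire format, and a read loop to get there, and the "exec-ok" case remains ambiguous in exactly the way the parenthetical above describes.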

1 comment

jjnoakes 3443 days ago

> Let's say I am using JNA. [...] It is not safe to call fork.

Are you calling fork() from Java, from C, or using the system call number?

Because I'd agree calling it from Java might be unsafe (depends on how Java and JNA interact), but I believe calling it from C or the system call is perfectly fine. And this is in line with what I've written previously.

> But that isn't robust.

It's not supposed to be robust in the way you are describing.

The fork-exec model is low level. It is supposed to be low level. Doing high level things with it is supposed to take some work by the application. That's not a deficiency.

If you build too many things into the low level code, you run into trouble because now you've got 10x as many ways to fail (building your pipes, writing your error messages, marshalling error state, cleaning up, you name it).

Also, some programs will want to do some of those higher level things differently. So instead of baking them into the API, with tons of parameters and overhead you pay on every call (like creating a pipe and writing error messages to the parent for every single fork and exec), you only do that work when you want it.