by arghwhat 3067 days ago
1. No, the point was that fork(2)+exec(3)/spawning processes is an expensive way to run code, not how long it takes for the parent to be able to do something else.

2. Your new benchmark is better. However, it is still a useless microbenchmark, because it measures an unrealistic best case. Your spawn of sleep happens within a fresh subshell started by the pipe you made. The cost of fork(2) depends on things like the virtual memory size and the number of open file descriptors of the parent process, and your subshell has next to nothing of either. A real application likely holds at least a few gigabytes of virtual memory (more likely tens of gigabytes; note that virtual memory isn't the same as resident memory), which will make fork(2) take much longer, with the cost split between parent and child.
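To make the point about parent size concrete, here is a rough sketch (not the benchmark from this thread) that times os.fork() while the parent keeps a buffer of a chosen size alive. The function name time_fork and the ballast_mib parameter are made up for illustration. It assumes a Unix-like system, and how much the numbers differ will depend on the kernel, on whether the pages were actually touched, and on overcommit settings.

```python
import os
import time


def time_fork(ballast_mib: int, repeats: int = 5) -> float:
    """Return the mean wall-clock seconds the parent spends in os.fork().

    ballast_mib is a hypothetical knob: the size of a buffer kept alive in
    the parent so that it has more memory mapped when it forks.
    """
    ballast = bytearray(ballast_mib * 1024 * 1024)  # zero-filled buffer
    total = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: exit at once, skipping cleanup handlers
        total += time.perf_counter() - start
        os.waitpid(pid, 0)  # reap the child so no zombies accumulate
    del ballast
    return total / repeats


if __name__ == "__main__":
    for mib in (0, 64, 256):
        print(f"{mib:4d} MiB ballast: {time_fork(mib) * 1e6:.0f} us per fork")
```

This only measures the parent's side of the call. The child pays its share later, for example through copy-on-write faults, which this sketch does not capture.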

I suspect you might be confusing asynchronicity with concurrency or parallelism. Go is heralded for concurrency, sometimes in the form of parallelism, but not asynchronicity. Concurrency does not have any positive effect on execution time or cost. Parallelism can reduce execution time, but it does not decrease execution cost; it simply throws more hardware at the problem.

In fact, Go is a worse-than-average language to call fork(2) in, because it runs fork(2) under a global lock. This is mentioned in the linked article. As memory consumption increased, fork(2) took longer, and the resulting contention on that lock was what made the process unresponsive.
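A toy model of that effect, not Go's actual runtime: several threads each "spawn" once, and every spawn holds one global lock for as long as the fork would take. The names fork_lock, spawn_under_lock and run_spawners are invented for this sketch, and time.sleep stands in for the time spent inside fork(2).

```python
import threading
import time

fork_lock = threading.Lock()  # hypothetical stand-in for a runtime-wide fork lock


def spawn_under_lock(hold_seconds: float) -> None:
    """Pretend to fork: hold the global lock for as long as the fork would take."""
    with fork_lock:
        time.sleep(hold_seconds)  # stands in for the time spent inside fork(2)


def run_spawners(workers: int, hold_seconds: float) -> float:
    """Start `workers` threads that each spawn once; return total wall-clock seconds."""
    threads = [
        threading.Thread(target=spawn_under_lock, args=(hold_seconds,))
        for _ in range(workers)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    for hold in (0.001, 0.01, 0.05):
        elapsed = run_spawners(workers=8, hold_seconds=hold)
        print(f"hold {hold * 1000:.0f} ms x 8 workers: {elapsed * 1000:.0f} ms total")
```

Because the lock serializes the spawns, the total time grows with workers times hold time, so a fork that gets slower as the process grows stalls every other caller waiting behind it.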

However, as I also said, whether fork is too expensive depends on the use-case.

> Each Gitaly server instance was fork/exec'ing Git processes about 20 times per second

> What's really wrong here is that they're apparently spawning processes like crazy.

Sounds like it depends on the use-case, rather than a blanket "two dozen processes per second is clearly absurd".

Definitely. While fork(2) is expensive, a price is useless without also knowing the budget, and how expensive it is depends on the environment.

However, the problem in the posted article was indeed that spawning Git processes 20 times a second in that specific Go application was too much, and the fix was that Go replaced fork(2) with posix_spawn(3).