by harryvederci 1615 days ago
Interesting CGI content linked on there.

I've been reading about / hacking on CGI recently, and it's been kinda fun!

Question: One thing I keep reading is how inefficient it is to start a new process for each incoming connection. Could someone explain to me why that's such a bottleneck? I imagine it being an issue back when CGI was used everywhere, people moving away from CGI, and forgetting about it. But haven't there been improvements in the meantime? Computers from today can run circles around those from a few decades back. Has everything improved except the speed / efficiency of starting a new process?

(I don't have a computer science background, but I guess you could already tell from the above.)

4 comments

> Interesting CGI content linked on there.

>

>I've been reading about / hacking on CGI recently, and it's been kinda fun!

>

>Question: One thing I keep reading is how inefficient it is to start a new process for each incoming connection. Could someone explain to me why that's such a bottleneck? I imagine it being an issue back when CGI was used everywhere, people moving away from CGI, and forgetting about it. But haven't there been improvements in the meantime? Computers from today can run circles around those from a few decades back. Has everything improved except the speed / efficiency of starting a new process?

>

It's not as bad as you think it is; just change the webserver to pre-fork. From this link[1], and the nice summary table in this link[2] - I note the following:

1. Pre-forked servers perform very consistently (little variation before being overwhelmed) and appear at a glance to be less consistent only than epoll.

2. For up to 2000 concurrent requests, the pre-forked server either performed within a negligible margin of the best performer, or was the best performer itself.

3. The threaded solution had the best graceful degradation; if a script was monitoring the ratio of successful responses, it would know well beforehand that a failure was imminent.

4. The epoll solution is objectively the best, providing both graceful degradation as well as managing to keep up with 15k concurrent requests without complete failure.

With all of the above said, it seems that using CGI with a pre-forked server is the second best option you can choose.
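For anyone unsure what "pre-forked" means in practice, here is a minimal sketch of the pattern, not the benchmark code from the links. The worker processes are forked once, up front, and each one then accepts several connections on the same listening socket. The function names, the worker count, and the fixed number of requests per worker are my own assumptions so that the example terminates; a real server would loop indefinitely. It is Unix-only because it relies on `os.fork`.

```python
import os
import socket


def serve_prefork(listener, num_workers, requests_per_worker):
    """Fork workers up front; each handles several connections, then exits."""
    pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:  # child: a long-lived worker process
            try:
                for _ in range(requests_per_worker):
                    conn, _addr = listener.accept()
                    with conn:
                        conn.recv(4096)  # read (and ignore) the small request
                        body = f"handled by worker {os.getpid()}\n".encode()
                        conn.sendall(
                            b"HTTP/1.0 200 OK\r\n"
                            b"Content-Type: text/plain\r\n\r\n" + body
                        )
            finally:
                os._exit(0)  # never fall back into the parent's code
        pids.append(pid)
    return pids


def fetch(port):
    """Send one tiny HTTP/1.0 request and read the whole response."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(b"GET / HTTP/1.0\r\n\r\n")
        chunks = []
        while True:
            data = client.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode()


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen(16)
port = listener.getsockname()[1]

worker_pids = serve_prefork(listener, num_workers=2, requests_per_worker=3)
responses = [fetch(port) for _ in range(6)]
for pid in worker_pids:
    os.waitpid(pid, 0)  # reap the workers once they have used their quota
listener.close()

for response in responses:
    print(response.splitlines()[-1])
```

Six requests are served by only two process IDs, so the fork cost is paid twice rather than six times. Note that this is the server side only; a CGI program launched per request would still be a separate process on top of this.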

I suppose that you then only have to factor in the execution of the CGI program (don't use Java, C#, Perl, Python, Ruby, etc - very slow startup times).

[1] https://unixism.net/2019/04/linux-applications-performance-i...

[2] https://unixism.net/2019/04/linux-applications-performance-p...

Careful, pre-fork as described in the given link has worker processes each handling many requests. This result therefore does not answer the question about the cost of one process per request. The one that does seems to be fork, which is way less efficient (~460 seems like a low number of processes spawned per second though, can we really not do more?).

I'll read those articles you shared, thanks!

Currently the CGI stuff I'm working on is to run stuff on a cheap shared host, so I'll have to check which category of servers that Apache falls in.

Once an application I'm running on a shared host becomes successful enough, I'm probably going to want to move to a different environment, but I'm still interested in what that would mean for performance :)

> Once an application I'm running on a shared host becomes successful enough, I'm probably going to want to move to a different environment, but I'm still interested in what that would mean for performance :)

Depending on what you are doing and what language you are using, a $5/m DO droplet might be sufficient. I once ran a single multi-threaded server, serving a simple binary protocol, and over a 2 day period it handled sustained loads of up to 30k concurrent connections.

To get it that high, I had to up the file descriptor limit on that host.
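Each open connection uses one file descriptor, and the default soft limit is often far below 30k (commonly 1024 on Linux), which is why the limit has to go up. As a rough sketch, a process can inspect and raise its own soft limit with Python's standard `resource` module (Unix-only). The helper name and the 65536 target are assumptions for illustration; raising the hard limit itself usually needs root or a system config change, which this does not attempt.

```python
import resource


def raise_fd_limit(target):
    """Raise the soft open-file limit toward `target`, capped by the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft, hard  # already unlimited, nothing to do
    new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
    if new_soft > soft:
        # An unprivileged process may raise its soft limit up to the hard limit.
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


before = resource.getrlimit(resource.RLIMIT_NOFILE)
after = raise_fd_limit(65536)  # hypothetical target, pick what your load needs
print("soft/hard before:", before)
print("soft/hard after: ", after)
```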

It's not just the start-up and shut-down costs. A CGI process might need to obtain connections to databases or other resources that could be pooled and re-used if the process didn't completely terminate.
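A small sketch of that difference, using SQLite only because it ships with Python. The table, the helper names, and the request count are made up for illustration. With a networked database the per-request connection would also pay for a TCP handshake and authentication, so the gap is typically larger than what this shows.

```python
import os
import sqlite3
import tempfile
import time


def handle_request(db):
    """Stand-in for the real work of one request."""
    return db.execute("SELECT n FROM hits").fetchone()[0]


def fresh_connection_per_request(path, requests):
    """CGI-style: every request opens and closes its own connection."""
    results = []
    for _ in range(requests):
        db = sqlite3.connect(path)
        try:
            results.append(handle_request(db))
        finally:
            db.close()
    return results


def reused_connection(path, requests):
    """Persistent-process style: one connection serves every request."""
    db = sqlite3.connect(path)
    try:
        return [handle_request(db) for _ in range(requests)]
    finally:
        db.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE hits (n INTEGER)")
    db.execute("INSERT INTO hits VALUES (42)")
    db.commit()
    db.close()

    start = time.perf_counter()
    fresh = fresh_connection_per_request(path, 200)
    fresh_s = time.perf_counter() - start

    start = time.perf_counter()
    reused = reused_connection(path, 200)
    reused_s = time.perf_counter() - start

print(f"fresh connection per request: {fresh_s * 1000:.1f} ms for 200 requests")
print(f"one reused connection:        {reused_s * 1000:.1f} ms for 200 requests")
```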

You might want to look at using FastCGI:

https://en.wikipedia.org/wiki/FastCGI

Basically, the CGI processes stay alive, and servers that support FastCGI (like Apache and nginx) hand each request to an existing FastCGI process that's waiting for more work, if one is available.

Thanks! That's a good point, about re-using connections.

For my current use-case* that wouldn't be an issue, so CGI could probably be OK there, then!

* A side project that uses SQLite (1 file per user), and no other external resources.

I’m smiling at your question!

Yes, it’s less efficient than having a persistent server, but like most things, it exists on a spectrum.

The load time for one of these processes is going to be almost trivial. I’m on mobile right now, but I would guess that it would be in a handful of milliseconds, especially when the binary is already in cache (due to other requests).
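If you'd rather measure than guess, here is a rough sketch that times how long it takes to spawn a tiny program and wait for it to exit. It assumes a Unix-like system with `true` on the PATH and falls back to a bare Python interpreter otherwise (which is much heavier, so the number will be larger). The helper name and run count are arbitrary.

```python
import shutil
import subprocess
import sys
import time


def spawn_time_ms(argv, runs=50):
    """Average wall-clock time to start `argv` and wait for it to exit."""
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(argv, check=True)
    return (time.perf_counter() - start) / runs * 1000.0


tiny = shutil.which("true")  # assumption: a do-nothing binary is available
argv = [tiny] if tiny else [sys.executable, "-c", "pass"]
per_spawn_ms = spawn_time_ms(argv)
print(f"{argv[0]}: about {per_spawn_ms:.2f} ms per spawn")
```

The figure includes the overhead of Python's `subprocess` module itself, so treat it as an upper bound on the raw process start cost rather than a precise number.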

But if you want to compare this against a lot of the prevailing systems, it’ll still probably win on single-request efficiency. Network hops, for example, are frequently quite slow and, if efficiency is your primary metric, should be avoided as much as possible. Things like Serverless go the opposite way and route both your incoming request and your backend database requests through a complex set of hops.

Thanks for your response!

I guess I should do some benchmarks comparing different technologies.

> Things like Serverless go the opposite way and route both your incoming request and your backend database requests through a complex set of hops.

I didn't know about that, thanks. If you know some good resources on the topic, feel free to put them in a reply to this message!

https://www.johndcook.com/blog/2011/01/12/how-long-computer-... is a decent place to start for thinking about how different timings compare. It's a bit on the stale side: some things have gotten much faster (e.g. disk "seeks" are dramatically different with NVMe), but a lot of it has stayed similar, and some will never change (packet timing to Europe has a speed-of-light limit for now).

Time a Python program that imports a few things and then immediately exits. It's significantly more CPU time than you might think. If you use a language with fast startup times, preforking CGI servers can be quite fast.
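A quick way to try that, as a sketch: start a fresh interpreter with and without a handful of imports and compare. The particular modules are an arbitrary choice, and this measures wall-clock time rather than CPU time, so it is only an approximation of the cost described above.

```python
import subprocess
import sys
import time


def best_run_ms(code, runs=5):
    """Best-of-N wall-clock time for a fresh interpreter to run `code` and exit."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", code], check=True)
        best = min(best, time.perf_counter() - start)
    return best * 1000.0


bare_ms = best_run_ms("pass")
imports_ms = best_run_ms("import json, decimal, http.client, argparse")
print(f"bare interpreter: {bare_ms:.1f} ms")
print(f"with imports:     {imports_ms:.1f} ms")
```

Whatever the second number is on your machine, a plain CGI setup pays it again on every single request.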