Not spawning, forking. Web servers were simple “accept(2) then fork(2)” loops for a long time. This is, for example, how inetd(8) works. Later, servers like Apache were optimized to “prefork” (i.e. to maintain a set of idle processes waiting for work, that would exit after a single request.)
Long-running worker threads came a long time later, and were indeed intensely criticized from a security perspective at the time, given that they’d be one use-after-free away from exposing a previous user’s password to a new user. (FCGI/WSGI was criticized for the same reason, as compared to the “clean” fork+exec subprocess model of CGI.)
Note that in the context of longer-running connection-oriented protocols, servers are still built in the “accept(2) then fork(2)” model. Postgres forks a process for each connection, for example.
One lesser-thought-about benefit of the forking model, is that it allows the OS to “see” requests; and so to apply CPU/memory/IO quotas to them, that don’t leak over onto undue impacts on successive requests against the same worker. Also, the OOM killer will just kill a request, not the whole server.
PHP these days doesn't fork and spawn a new process, though it does create a new interpreter context.
In the old cgi-bin days, every web request would fork and exec a new script, whether PHP, Perl, C program, etc. That was replaced with Apache modules (or nsapi, etc), then later, with long running process pooled frameworks like fcgi, php-fpm, etc. Perl and PHP typically then didn't fork for every request. But did create a fresh interpreter context to be backward compatible, avoid memory leaks, etc. So there's still overhead, but not as heavy as fork/exec.
The (web) world used to be synchronous.
Traditional Apache spawns a number of threads and then keeps each thread around for x number or requests, after which the thread is killed and a new one spawned.
Incredibly useful feature when you're on limited hardware and want to ensure you don't memory leak yourself out of existence. Modern Apache has newer options (and of course nginx has traditionally been entirely async on multiple threads).
Killing a process is much safer than killing a thread, and the OS does cleanup.
It's not great for maximizing performance but it's not 100s of milliseconds either, forking doesn't take long; what is slow is scripting languages loading their runtimes, but you can fork after that's loaded. If hardware is cheaper than opportunity cost of adding new features (rather than debugging leaks) it makes sense.
Long-running worker threads came a long time later, and were indeed intensely criticized from a security perspective at the time, given that they’d be one use-after-free away from exposing a previous user’s password to a new user. (FCGI/WSGI was criticized for the same reason, as compared to the “clean” fork+exec subprocess model of CGI.)
Note that in the context of longer-running connection-oriented protocols, servers are still built in the “accept(2) then fork(2)” model. Postgres forks a process for each connection, for example.
One lesser-thought-about benefit of the forking model, is that it allows the OS to “see” requests; and so to apply CPU/memory/IO quotas to them, that don’t leak over onto undue impacts on successive requests against the same worker. Also, the OOM killer will just kill a request, not the whole server.