by hardwaresofton 2082 days ago
Don't worry, I don't think these lessons were forgotten -- we've made our way back full circle already with "serverless functions" if you squint a little bit.
3 comments

I think coming full circle is a sign that they were indeed forgotten - as in "Those who cannot remember the past are condemned to repeat it". Although the newer iterations of this idea are more managed and auto-scalable and such.
The fact that serverless functions are managed by someone else, including the auto scaling, is a big difference.
So was cgi-bin on shared hosting providers!
Including scaling? I definitely remember it as being single-host/single-server.
Let's dive a little into how Serverless & FastCGI might be related.

The thing that FastCGI brought over cgi-bin was that an application process could be left open to communicate with the server, whereas the cgi-bin model required spawning a new process for each request.
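
To make that difference concrete, here is a rough sketch in Python. It does not speak the real CGI or FastCGI protocols; it only contrasts "new process per request" with "one process kept open", using plain pipes. The names `cgi_style` and `PersistentWorker` are made up for illustration.

```python
import subprocess
import sys

# Tiny "application": report which process handled which path.
ONE_SHOT = "import os, sys; print(f'pid={os.getpid()} path={sys.argv[1]}')"
LOOP = (
    "import os, sys\n"
    "for line in sys.stdin:\n"
    "    print(f'pid={os.getpid()} path={line.strip()}', flush=True)\n"
)

def cgi_style(path):
    """cgi-bin model: start a fresh interpreter for every single request."""
    done = subprocess.run(
        [sys.executable, "-c", ONE_SHOT, path],
        capture_output=True, text=True, check=True,
    )
    return done.stdout.strip()

class PersistentWorker:
    """FastCGI-like model: one child stays open and answers many requests."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", LOOP],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
        )

    def request(self, path):
        # One line in, one line out; the child is not restarted in between.
        self.proc.stdin.write(path + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().strip()

    def close(self):
        self.proc.stdin.close()
        self.proc.wait()

if __name__ == "__main__":
    print(cgi_style("/a"), "|", cgi_style("/b"))  # each call pays start-up cost
    worker = PersistentWorker()
    print(worker.request("/a"), "|", worker.request("/b"))  # same pid twice
    worker.close()
```

The persistent worker reports the same pid for every request, which is the property that lets it keep caches or connections warm between requests.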

If one reads the AWS Lambda docs, they'll see the execution context[1] has a similar behavior. AWS will spin up new instances, but these instances will serve multiple requests, via a fairly custom "function" interface defined for various runtimes (but which is actually, typically, an HTTP interface). There is a standard HTTP API for runtimes to use to retrieve function invocations[2].

With FastCGI the front-end server uses a socket to push request messages to app servers, which reply in order. Whereas with Lambda & its above-mentioned runtime API, the runtime is retrieving requests from Amazon at its own pace, & fulfilling them as it can. So there's a push vs pull model, but in both cases, the application server is talking a fairly custom protocol to the front-end server.
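
For the pull side, a minimal sketch of what a custom runtime's loop might look like, assuming the documented Lambda Runtime API paths (`/2018-06-01/runtime/invocation/next` and `/2018-06-01/runtime/invocation/{request id}/response`) and the `Lambda-Runtime-Aws-Request-Id` response header. The function names `poll_once` and `serve_forever` are hypothetical, and error reporting is left out entirely.

```python
import http.client
import json

API_VERSION = "2018-06-01"

def poll_once(runtime_api, handler):
    """Pull one invocation, run the handler, and post the result back.

    runtime_api is a "host:port" string; inside Lambda it would come from
    the AWS_LAMBDA_RUNTIME_API environment variable.
    """
    base = f"/{API_VERSION}/runtime/invocation"

    # Pull: the runtime asks for work; the front end does not push it.
    conn = http.client.HTTPConnection(runtime_api)
    conn.request("GET", f"{base}/next")
    resp = conn.getresponse()
    request_id = resp.getheader("Lambda-Runtime-Aws-Request-Id")
    event = json.loads(resp.read())
    conn.close()

    result = json.dumps(handler(event)).encode()

    # Report the result for exactly that invocation.
    conn = http.client.HTTPConnection(runtime_api)
    conn.request("POST", f"{base}/{request_id}/response", body=result)
    conn.getresponse().read()
    conn.close()
    return request_id

def serve_forever(runtime_api, handler):
    """The same process handles invocation after invocation, at its own pace."""
    while True:
        poll_once(runtime_api, handler)
```

The loop is the part that resembles FastCGI (one process, many requests); the direction of the first message is the part that differs.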

Also though, there are some cgi-bin-like behaviors seen in some serverless systems. Serverless is a big umbrella with a lot of different implementation strategies. One optimization is the use of checkpoint-restore. With checkpoint-restore, an app server is brought up to a "ready to serve" state, then the host operating system takes a "snapshot" of the process. When new instances of the process are needed, the serverless system can "restore" this memory-mapped process & the resources it was using, bringing it up in a ready-to-serve state quickly. This behavior is more cgi-bin-like, in that it's a technique for spawning new serving processes quickly, although few serverless systems go as far as cgi-bin went with a per-request process. Nonetheless, OpenWhisk, for example, was showing off start times decreasing from 0.9-0.5s for Node, Python, and Java app servers down to 0.09s-0.7s using these checkpoint-restore capabilities of the OS[3].
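
A loose analogy for why starting from a warmed-up process is cheap, sketched in Python with `os.fork` (Unix only). This is not checkpoint-restore and does not use CRIU: fork copies a live parent rather than restoring a saved snapshot. It only shows the shared idea of paying the start-up cost once and then stamping out ready-to-serve processes. `warm_up` and `spawn_from_warm` are hypothetical names.

```python
import os
import time

def warm_up():
    """Stand-in for slow start-up work (loading code, reading config, ...)."""
    time.sleep(0.2)
    return {"routes": {"/": "hello"}}

def spawn_from_warm(state, path):
    """Fork a child that already holds the warmed state; serve one request."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: starts with a copy of the parent's memory, so no warm-up here.
        os.close(read_fd)
        os.write(write_fd, state["routes"].get(path, "404").encode())
        os._exit(0)
    # Parent: collect the child's reply and reap it.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        reply = reader.read()
    os.waitpid(pid, 0)
    return reply

if __name__ == "__main__":
    state = warm_up()                     # paid once
    print(spawn_from_warm(state, "/"))    # per-request process, no warm-up
    print(spawn_from_warm(state, "/nope"))
```

Here each request does get its own process, cgi-bin style, but none of them repeats the 0.2s of simulated initialization.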

[1] https://docs.aws.amazon.com/lambda/latest/dg/runtimes-contex...

[2] https://docs.aws.amazon.com/en_us/lambda/latest/dg/runtimes-...

[3] https://events19.linuxfoundation.org/wp-content/uploads/2017...

And this is where the squinting comes in! As far as I'm concerned the differences are mostly implementation details. As far as the idea of deploying small, relatively self-contained chunks of functionality goes, I think serverless is a revisit of the principles of CGI. I've actually given a tiny presentation on this, if anyone would like to read my view in depth[0], which I think fully explains my stance. Huge grain of salt though -- I wasn't there when all this happened, and I only caught the end of the cgi-bin era.

> The thing that FastCGI brought over cgi-bin was that an application process could be left open to communicate with the server, whereas the cgi-bin model required spawning a new process for each request.

Absolutely (I read/skimmed the article, and this was brought up) -- serverless functions are the same here, because they also allow machines to stay running for some indefinite period of time.

> If one reads the AWS Lambda docs, they'll see the execution context[1] has a similar behavior. AWS will spin up new instances, but these instances will serve multiple requests, via a fairly custom "function" interface defined for various runtimes (but which is actually, typically, an HTTP interface). There is a standard HTTP API for runtimes to use to retrieve function invocations[2].

> With FastCGI the front-end server uses a socket to push request messages to app servers, which reply in order. Whereas with Lambda & its above-mentioned runtime API, the runtime is retrieving requests from Amazon at its own pace, & fulfilling them as it can. So there's a push vs pull model, but in both cases, the application server is talking a fairly custom protocol to the front-end server.

This is one of the reasons I said "serverless functions" instead of Lambda. While AWS happens to specify their operational semantics that way, there is no need for anyone else to. While improvements are necessary, I still think this is fairly close to FCGI, and FCGI could absolutely serve as a "serverless" provider implementation.

Needless to say, how long you keep around the process, or how you checkpoint restore (criu[1] is very interesting, for anyone who's never seen it) and move processes are all implementation-specific in my mind.

[0]: https://gitlab.com/mrman/talks/raw/master/dist/2019/04/merca...

[1]: https://criu.org/Main_Page