Hacker News
by tpetry 1487 days ago
How do you plan to start a PostgreSQL instance in less than 1 sec? Sounds interesting.

I tried fast-booting PostgreSQL instances and it always took multiple seconds. So I am really curious!

2 comments

That's where the separation of storage and compute kicks in, I guess. The startup process of our Postgres instance (compute node) is a bit different from vanilla Postgres. We need to go to the network storage service (pageserver and safekeepers) to get the last known commit LSN, but we don't need to perform any sort of recovery on the compute node side. That way, compute is mostly stateless.

Basically, to start we only need to know this LSN and to bootstrap the Postgres processes, and that really is quick. After that, compute is ready to accept connections and serve requests, since it can get any missing pages from the pageserver with a GetPage@LSN request.
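To make the flow above concrete, here is a toy sketch in Python. It is not the real implementation or protocol: `ToyPageserver`, `ToyCompute`, and every method name are hypothetical, LSNs are plain integers, and the last commit LSN is read from the toy pageserver alone (the comment above says the real system also consults the safekeepers). It only shows the idea that a compute node can start with nothing but an LSN and pull pages lazily.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPageserver:
    """Hypothetical stand-in for the network storage service."""
    # page_id -> list of (lsn, contents) versions
    versions: dict = field(default_factory=dict)
    last_commit_lsn: int = 0

    def write(self, page_id, lsn, contents):
        """Record a new version of a page at the given LSN."""
        self.versions.setdefault(page_id, []).append((lsn, contents))
        self.last_commit_lsn = max(self.last_commit_lsn, lsn)

    def get_page_at_lsn(self, page_id, lsn):
        """Return the newest version of the page whose LSN is <= lsn."""
        candidates = [v for v in self.versions.get(page_id, []) if v[0] <= lsn]
        if not candidates:
            raise KeyError(f"no version of {page_id!r} at or before LSN {lsn}")
        return max(candidates, key=lambda v: v[0])[1]


class ToyCompute:
    """Hypothetical stateless compute node: starts with only an LSN."""

    def __init__(self, pageserver):
        self.pageserver = pageserver
        # The only state needed at startup (assumption: no local recovery).
        self.start_lsn = pageserver.last_commit_lsn
        self.cache = {}          # empty after a cold start
        self.remote_fetches = 0  # counts round trips to the pageserver

    def read_page(self, page_id):
        """Serve from the local cache, fetching a missing page on demand."""
        if page_id not in self.cache:
            self.cache[page_id] = self.pageserver.get_page_at_lsn(
                page_id, self.start_lsn
            )
            self.remote_fetches += 1
        return self.cache[page_id]
```

In this sketch the first read of each page costs a remote fetch, which is the cold-start query latency problem mentioned in the next paragraph.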

We do still have a whole bunch of problems to solve: query latency after a cold start; startup after an unexpected exit of a heavily loaded Postgres instance could be slower; etc.

Some parts of the PostgreSQL start-up sequence take a long time:

- Initializing shared memory -> For now we only have small instances, so that doesn't hit us as hard

- Reading data directories -> We don't have to do that at all

- Replaying WAL from a previous unclean shutdown -> We don't need to do that; PageServer is responsible for it

- When initializing a whole new database: Initializing the data directory -> We keep a copy that each instance gets initialized from, so the process becomes "copy those ~16MB in the background", which saves us from running the costly initialization.

And there are several more infrastructural optimizations, such as pre-loading the Docker images onto the hosts.
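As a rough illustration of the "copy a prebuilt data directory in the background" idea from the list above, here is a minimal Python sketch. The function name `init_from_template` and the directory layout are made up for the example, and it uses a plain thread pool plus `shutil.copytree` from the standard library; how the real system performs the copy is not described in this thread.

```python
import shutil
from concurrent.futures import Future, ThreadPoolExecutor
from pathlib import Path

# Single background worker for copies (an assumption made for this sketch).
_copy_pool = ThreadPoolExecutor(max_workers=1)


def init_from_template(template_dir: Path, target_dir: Path) -> Future:
    """Start copying a prebuilt template data directory in the background.

    Returns a Future whose result is the target path once the copy is done.
    target_dir must not exist yet, since shutil.copytree creates it.
    """
    return _copy_pool.submit(shutil.copytree, template_dir, target_dir)
```

The caller can carry on with other startup work and call `.result()` on the returned future only when the data directory is actually needed.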