by ldoughty 1488 days ago
AWS Aurora Serverless says:

> You pay only for the capacity your application consumes.

> Scales down to 0.5

But it actually can't scale down to 0.5, or the DB falls over just existing. Autoscaling won't let you go that low unless you set 0.5 as the max, which means it never scales up, and then it's dead, because the DB can't run with that little CPU.

So it's fair to ask if Neon can scale to 0, both in marketing and in practice.


We do scale the compute part down to zero after 5 minutes of inactivity now (no active transactions). This 5-minute threshold is an arbitrary pick; it could become 1 minute or 30 minutes later, or even be customizable by the end user. The storage part is heavily multi-tenant, so it's always running, and our main objective there is to make resource utilization as efficient as possible.
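The suspend rule described above (no active transactions, then a fixed idle window) can be sketched roughly as follows. This is only an illustration of the policy, not Neon's code: the class name `IdleSuspender`, its fields, and the explicit timestamps are all hypothetical, and the 300-second default is just the 5-minute figure from the comment.

```python
from dataclasses import dataclass


@dataclass
class IdleSuspender:
    """Hypothetical helper that decides when an idle compute node may be suspended."""

    idle_threshold_s: float = 300.0  # 5 minutes, the arbitrary value mentioned above
    active_transactions: int = 0
    last_activity_s: float = 0.0

    def begin_transaction(self, now_s: float) -> None:
        self.active_transactions += 1
        self.last_activity_s = now_s

    def end_transaction(self, now_s: float) -> None:
        self.active_transactions = max(0, self.active_transactions - 1)
        self.last_activity_s = now_s

    def should_suspend(self, now_s: float) -> bool:
        # Never suspend while a transaction is open, however long it has been running.
        if self.active_transactions > 0:
            return False
        # Otherwise suspend once the idle window has fully elapsed.
        return now_s - self.last_activity_s >= self.idle_threshold_s
```

Making the threshold a field rather than a constant matches the remark that it could later be changed or exposed to the end user.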

There is still significant latency on the first connection attempt after a suspend (1-2 seconds), but we are working on that, and getting the startup time under 1 second seems realistic.

The pricing model is still a work in progress, so I cannot say much about it. Still, my personal intention is to make it cost-effective for both the end user and us. I'd prefer not to build a service with claims like 'here is your free-tier serverless Postgres with zero latency on connect', which actually means that under the hood there is an always-running compute burning the investors' money. Hope it's realistic to achieve :)

-- Cloud engineer @ Neon

That's interesting to hear. That would probably work great for my use cases, which typically mean waking up to refresh a CDN for guests, but being ready to work for a bit when a content creator logs in (e.g. a WordPress instance without comments or non-author logins).

Looking forward to seeing how this works out. I have no issues paying for services, I just hate that the minimum entry-level cost is $20... I can't imagine why, at scale, it can't be more affordable for hobby/fun-level projects.

How do you plan to start a PostgreSQL instance in less than 1 sec? Sounds interesting.

I tried fast-booting PostgreSQL instances and it always took multiple seconds. So I am really curious!

That's where the separation of storage and compute kicks in, I guess. The startup process of our Postgres instance (compute node) is a bit different from vanilla Postgres. We need to go to the network storage service (pageserver and safekeepers) to get the last known commit LSN, but we don't need to perform any sort of recovery on the compute node side. That way, compute is mostly stateless.

Basically, to start we need to know this LSN and to bootstrap the Postgres processes. It really is that quick. After that, compute is ready to accept connections and serve requests, as it's able to get any missing pages from the pageserver with a GetPage@LSN request.

We do have a whole bunch of problems to solve: query latency after a cold start; startup after an unexpected exit of a heavily loaded Postgres instance could be slower; etc.
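To make the "stateless compute" idea concrete, here is a toy sketch of the flow described above: the compute node learns the last commit LSN at startup, begins with an empty cache, and fetches each page on first access at that LSN. Everything here is an assumption made for illustration (the `PageServer` and `Compute` classes, their method names, the in-memory storage); the real pageserver protocol is certainly more involved.

```python
import bisect
from typing import Dict, List, Tuple


class PageServer:
    """Toy stand-in for the storage service: keeps every version of every page."""

    def __init__(self) -> None:
        self._versions: Dict[int, List[Tuple[int, bytes]]] = {}
        self.last_commit_lsn = 0

    def write(self, page_id: int, lsn: int, data: bytes) -> None:
        # Assumes writes arrive in increasing LSN order, as WAL records would.
        self._versions.setdefault(page_id, []).append((lsn, data))
        self.last_commit_lsn = max(self.last_commit_lsn, lsn)

    def get_page_at_lsn(self, page_id: int, lsn: int) -> bytes:
        """Return the newest version of the page whose LSN is <= the requested LSN."""
        versions = self._versions.get(page_id, [])
        idx = bisect.bisect_right([v[0] for v in versions], lsn) - 1
        if idx < 0:
            raise KeyError(f"page {page_id} does not exist at LSN {lsn}")
        return versions[idx][1]


class Compute:
    """Toy compute node: no local data, no recovery, only an LSN and a cache."""

    def __init__(self, storage: PageServer) -> None:
        self.storage = storage
        self.start_lsn = storage.last_commit_lsn  # the only state needed to start
        self.cache: Dict[int, bytes] = {}
        self.remote_fetches = 0

    def read_page(self, page_id: int) -> bytes:
        # A cache miss turns into a GetPage@LSN-style request to storage.
        if page_id not in self.cache:
            self.cache[page_id] = self.storage.get_page_at_lsn(page_id, self.start_lsn)
            self.remote_fetches += 1
        return self.cache[page_id]
```

The empty cache at startup is also why the first queries after a cold start are slower: every page they touch costs a network round trip until the cache warms up.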

Some parts of the PostgreSQL start-up sequence take a long time:

- Initializing shared memory -> We, for now, have only small instances, so that doesn't hit us as hard

- Reading data directories -> We don't have to do that at all

- Replaying WAL from a previous unclean shutdown -> We don't need to do that; PageServer is responsible for it

- When initializing a whole new database: Initializing the data directory -> We have a copy that each instance gets initialized from, which turns the process into "copy those ~16MB in the background" and saves us from the costly initialization process.

And there are several more infrastructure optimizations, such as pre-loading the Docker images onto the hosts.
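The template-copy trick for new databases could look roughly like the sketch below: rather than running the usual initialization (`initdb` in vanilla Postgres) for every new instance, a prebuilt data directory is copied into place. The function name and both paths are hypothetical, and how Neon actually stores or distributes its template is not stated in this thread.

```python
import shutil
from pathlib import Path


def init_data_dir_from_template(template_dir: Path, data_dir: Path) -> Path:
    """Create a fresh data directory by copying a prebuilt template (hypothetical)."""
    if data_dir.exists():
        # Refuse to overwrite an existing instance's data.
        raise FileExistsError(f"{data_dir} already exists")
    # A recursive copy of a small (~16MB) tree replaces the costly initialization.
    shutil.copytree(template_dir, data_dir)
    return data_dir
```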