by sologoub 3126 days ago
Really wish "serverless" also meant that it can work with AWS Lambda efficiently. As is, each function would try to open a connection, making the overall overhead extremely high and stressing DBs.
3 comments

Can't you open the connection outside of the function?

So as long as the function is hot, it won't reconnect.

That's right. Make connections to databases (and most other things) the first time your Lambda handler runs, and stash them in a static/global variable for re-use on future runs. That allows you to amortize the cost of forming the connection over many executions of your function, which improves latency, reduces cost, and reduces load on the backend.
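A minimal sketch of that pattern in Python, assuming a Lambda-style `handler(event, context)` entry point. `FakeConnection` and `get_connection` are hypothetical names standing in for a real database driver so the example runs anywhere; the `"sql"` key on the event is also just an assumption for illustration.

```python
import itertools


class FakeConnection:
    """Hypothetical stand-in for a real DB connection."""

    _ids = itertools.count(1)

    def __init__(self):
        # Pretend this is the expensive part (TCP/TLS handshake, auth).
        self.id = next(FakeConnection._ids)

    def query(self, sql):
        return f"conn {self.id}: {sql}"


# Module-level state survives between invocations while the container is warm.
_connection = None


def get_connection():
    """Open the connection on first use, then return the cached one."""
    global _connection
    if _connection is None:
        _connection = FakeConnection()
    return _connection


def handler(event, context):
    # Only the first (cold) invocation pays the connection cost.
    conn = get_connection()
    return conn.query(event["sql"])
```

Calling `handler` twice in the same process reuses connection 1 both times. With a real driver you would probably also want to check that the cached connection is still alive before reusing it, since the backend may have closed it while the container sat idle.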
Well, amortize sounds a bit funny when the first user basically has to pay the whole cold run, hehe
Haven't heard of this approach yet. Do you have a write up I could reference to try it out?
I haven't read anything official, only some stuff by framework makers (serverless/apex up)

edit: https://medium.com/@tjholowaychuk/aws-lambda-lifecycle-and-i...

Thanks, but I think this is very different from connection pooling in front of a DB, say PgBouncer.

Doing some searching, I did find this that seems much closer: http://blog.rowanudell.com/database-connections-in-lambda/

TL;DR: you can define the connection to the DB outside the scope of a given function, so it's scoped to the container and can be reused as long as the container is not recycled. Seems promising!

That's basically what I wrote
Are you sure? That is, do you have specific knowledge of the implementation? Because:

> The endpoint is a simple proxy that routes your queries to a rapidly scaled fleet of database resources.

That doesn’t seem to preclude a multiplexing proxy a la PgBouncer.

I don't think it is, just wishing it was.
Is that true? The default limit on concurrent function executions is 1000. The existing Aurora (MySQL) should be able to handle cycling through those connections without issue.
It’s not that the DB servers can’t handle it, it’s that establishing a connection is slower than re-using an existing one.

You also forgo certain optimizations within the DB designed to make fetching things for the given connection/scope faster, such as temp tables.