Improved VPC Networking for AWS Lambda | HN Mirror


	Improved VPC Networking for AWS Lambda (aws.amazon.com)
	158 points by joaofs 2477 days ago

8 comments

gazzini 2477 days ago

This is huge for Lambda. It allows devs to create “serverless” apps [1], with relational databases, without 10+ second cold-start times. In the article, they measure it as 988ms.

I have tried building an API using API Gateway <-> Lambda, but had to choose between using DynamoDB to store data (no-SQL, so challenging to query) or suffering unacceptably long response times whenever a request happens to cause a cold-start. Theoretically, this problem is now going away!

[1] https://serverless-stack.com

paulddraper 2477 days ago

*It allows devs to create those apps _within a VPC_.

You could always have fast startup with Lambda + database outside the VPC.

jabart 2477 days ago

Which is how most breach announcements start

"A database server was found with an open port exposed to the internet and no or poor authentication, all records were exposed."

This also should mean that Lambdas can get stable public IPs through a VPC for firewalls as well.

*edit for must to most.

fulafel 2476 days ago

But VPC is not an especially efficient additional "defense-in-depth" layer against this kind of "fucked up both firewall and password" configuration mistake. The first obvious ones are passwords, network-level firewalling, and host-level firewalling of course, and after that you can add monitoring / port scanning for all your "must be firewalled" services. And you can mandate better-than-passwords authentication methods[1]. Etc. The latter is better because it is more general and doesn't add costly complexity to your networking topology (by way of NAT and/or ambiguous RFC 1918 addressing).

[1] For example https://www.postgresql.org/docs/current/auth-cert.html or https://aws.amazon.com/premiumsupport/knowledge-center/users...

Niksko 2476 days ago

You mention defense in depth, but then immediately decide that an extra layer of defense is unnecessary.

fulafel 2476 days ago

Are you proposing that by acknowledging defense-in-depth, consistency dictates that one should pile up as many layers per attack vector as possible? Maybe, if you have infinite resources and don't need to make compromises on where you spend effort and resources in your risk management plan. But that's rarely the case in the real world.

gazzini 2477 days ago

You raise a fair point, this was possible, although it seems safe to say it would be a compromise on security.

I think it’s best not to expose the DB to outside connections in general, although it is still possible [1] when using RDS instances.

I think this is different for things like DynamoDB because, instead of a standard SQL-like db “connection”, they use AWS role-based auth for each request.

Of course, one could always configure some type of proxy service between the lambda and the DB... but that seems antithetical to going “serverless” in the first place.

[1] https://stackoverflow.com/questions/45227397/publicly-access...

Edit: I thought it was not possible to expose an RDS instance outside of a VPC, but I was wrong (you can place it in a public subnet, linked in [1]).

k__ 2477 days ago

Also, wasn't Aurora Serverless created because of that problem?

gazzini 2477 days ago

I think Aurora Serverless has even worse [1] cold-start times (for the DB itself), and it was intended as more of a price-optimization than a performance boost.

[1] https://forums.aws.amazon.com/thread.jspa?threadID=288043

nostrebored 2477 days ago

Aurora Serverless also handles connections. The problem of having a burst of 1000 concurrent invocations accessing your databases still exists even with VPC access

reilly3000 2477 days ago

That limit can be raised, apparently. I've seen mention of limits up to 30K concurrent invocations.

danb232 2477 days ago

is that a good tutorial? looks really good on the surface!

mvanbaak 2476 days ago

If you put an event bus in the middle (Kinesis) your api-lambda functions don't need direct access to your RDS. Subscribe lambda functions to your Kinesis stream, and let them handle the link to your RDS. This way you won't notice the cold starts.
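A minimal sketch of the consumer side of that idea, assuming the API-facing functions put JSON documents on the stream. It relies on the standard shape of a Kinesis-triggered Lambda event, where each record carries base64-encoded data under `kinesis.data`. The `write_row` parameter is a hypothetical stand-in for whatever code actually talks to RDS; it is not part of any AWS API.

```python
import base64
import json


def decode_kinesis_records(event):
    """Return the JSON payloads carried by a Kinesis-triggered Lambda event."""
    payloads = []
    for record in event.get("Records", []):
        raw = base64.b64decode(record["kinesis"]["data"])
        payloads.append(json.loads(raw))
    return payloads


def handler(event, context, write_row=print):
    # write_row is a hypothetical placeholder for the database write.
    rows = decode_kinesis_records(event)
    for row in rows:
        write_row(row)
    return {"written": len(rows)}
```

This only covers the write path; reads from the API would still need some route to the database.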

meekins 2476 days ago

This comment doesn't seem to make sense, could you elaborate a bit? How would you replace the database working as a persistence layer to an API application by polling an event stream?

dabeeeenster 2476 days ago

It doesn't replace the DB. It just uses Kinesis to be the messaging provider from the lambda to the DB and back. Not sure that's a great idea TBH but who knows?!

scarface74 2476 days ago

And then when you need to read the database?

jmb12686 2477 days ago

AWS announced this enhancement at 2018 re:Invent. It was slated for "sometime in 2019". I was excited, and I'm impressed that they released the feature well ahead of the end of the year (and before the next conference, which would obviously raise a few questions)

scarface74 2476 days ago

They did something similar with drift detection and CloudFormation. They announced it at re:Invent 2017 and released it one week before re:Invent 2018.

slovenlyrobot 2477 days ago

This has been a /major/ sore point for Lambda use, amazing they fixed it, and always great to see they've documented the intense engineering requirements involved to make it happen.

AWS is a beautiful mix of business and technology, it's very rare to see such a large engineering-driven organization managing to balance customer friendliness. I'm an unashamed fanboy

k__ 2477 days ago

Major is a bit harsh.

As far as I know this was only an issue for legacy architectures.

scarface74 2477 days ago

No. Using an RDBMS instead of DynamoDB is not a “legacy” architecture. You also shouldn’t expose your database publicly.

k__ 2476 days ago

RDBMS is not legacy, but perimeter security certainly is.

scarface74 2476 days ago

I’m one of the harshest critics of “lift and shifters” - old school net ops people who get one certificate by watching an ACloudGuru video, duplicate their on prem infrastructure and processes to the cloud and don’t go all in on the advantages of it and end up costing their clients more - but nowhere is it considered “legacy” to not use perimeter security.

jimmychangas 2476 days ago

Honest question: what, in your opinion, is the state-of-the-art approach? Something like BeyondCorp?

k__ 2476 days ago

I think zero-trust goes in a good direction.

https://www.securityroundtable.org/zero-trust-approach-can-m...

slovenlyrobot 2477 days ago

There is an entire ecosystem of tooling that will shit itself and wake up half the company if you assign a public IP address in the wrong VPC

Stuff like this is a pain in the ass, it was a major problem

k__ 2476 days ago

https://www.securityroundtable.org/security-without-boundari...

ajoy 2477 days ago

This solves one part of the cold start problem. Starting the container and loading the image on to it is still going to cause some latency.

nostrebored 2477 days ago

Solves might be strong, but it removes a big portion of the cold start latency that was difficult to optimize for and out of the control of developers. Creating minimal images isn't difficult for a number of environments (e.g. webpacking your node.js lambdas) and barring necessarily large images (think pandas on Lambda) this puts a lot of control for the cold start p99 back in the hands of customers.

Overall, definitely a big win!

k__ 2477 days ago

I found it a bit strange that they sold Lambda as THE new way to do API development.

You can connect API-Gateway with other services via Velocity templates, which don't have cold starts.

AppSync also doesn't suffer from cold starts.

Both are also serverless services.

Lambda is good if the other solutions are missing something, so you can drop it in quickly, but I wouldn't use it as the go-to service for that...

Scarbutt 2477 days ago

API-Gateway can return HTML?

k__ 2477 days ago

Sure.

You can write Velocity templates for integration responses.

Normally they are JSON because that's what all the AWS services return and API-Gateway just passes them along.

But you could write something like this:

    #set($pets = $input.path('$'))
    <html lang="en">
      <head>
        <meta charset="utf-8">
        <title>Pets</title>
      </head>
      <body>
        <table>
                <tr>
                    <th>ID</th>
                    <th>Type</th>
                    <th>Price</th>
                </tr>
            #foreach($pet in $pets)
                <tr>
                    <td>$pet.id</td>
                    <td>$pet.type</td>
                    <td>$pet.price</td>
                </tr>
            #end
        </table>
      </body>
    </html>

StreamBright 2477 days ago

Which can be mitigated by invoking your own Lambda functions once every minute or 5 minutes. Usually does not blow the budget.
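A rough sketch of what the function side of such a warm-up could look like, assuming the pings come from a scheduled CloudWatch Events rule, whose events carry `"source": "aws.events"`. The `handle_request` function is a hypothetical placeholder for the real logic.

```python
def is_warmup_ping(event):
    """True when the invocation came from a scheduled rule, not a real request."""
    return isinstance(event, dict) and event.get("source") == "aws.events"


def handle_request(event):
    # Hypothetical placeholder for the function's real work.
    return {"statusCode": 200, "body": "ok"}


def handler(event, context):
    if is_warmup_ping(event):
        # Return before doing any real work so the ping stays cheap.
        return {"warmup": True}
    return handle_request(event)
```

A single scheduled ping like this keeps at most one execution environment warm, and reuse of that environment is not guaranteed.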

nostrebored 2477 days ago

Warming functions in the previous VPC architecture was always a questionable practice. You had no guarantee that your environments would be warm across all subnets or which subnets would handle incoming requests. Beyond that, what happens to requests which you receive when the function is being warmed? You still incur cold starts.

There has never been a guarantee of environment reuse. Any architecture which isn't capable of incurring cold starts is not a good fit for serverless.

scarface74 2477 days ago

Which is a horrible idea....

How many lambdas do you keep warm? 5, 10, 20? Every new connection is a new lambda instance. You're still just delaying the inevitable.

Just use Fargate if you want to stay serverless and don't want the cold start times -- well at least before today.

StreamBright 2476 days ago

Sorry but it does not matter how many since everything is automated and you create the warm up scheduler when you create the function. As others pointed out in this thread, there are other challenges with this approach.

>> Just use Fargate

We were trying to and we decided that is not our cup of tea. Lambdas are.

scarface74 2476 days ago

Yes it does matter. In your scheduler, how do you ensure your ping (the way you start an instance) is actually creating another instance to keep warm or reusing another instance?

If you want to always keep 20 instances warm, you have to keep the first ping active until the 20th one is done.

In other words, if you want to keep 20 active instances warm and you send 20 requests in 5 seconds, and each request only takes .25 seconds, you will only have 5 warm lambdas. The 6th real concurrent connection will still have a cold start. Also while you are pinging the request to keep it warm, that instance can serve a real user.
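The effect can be illustrated with a toy model. This is only a sketch under a strong assumption: a ping always reuses an instance that is already idle, and a new instance starts only when every existing one is busy. Real Lambda placement is not this simple, and the exact count depends on how the pings are timed, so the numbers illustrate the direction of the effect and nothing more.

```python
def warmed_instances(arrivals, duration):
    """Count distinct instances started when an idle one is always reused."""
    busy_until = []  # one entry per instance: the time it becomes idle again
    for t in sorted(arrivals):
        for i, free_at in enumerate(busy_until):
            if free_at <= t:
                busy_until[i] = t + duration  # reuse the idle instance
                break
        else:
            busy_until.append(t + duration)  # everything busy: start a new one
    return len(busy_until)


spaced = [i * 0.25 for i in range(20)]  # 20 pings spread over 5 seconds
burst = [0.0] * 20                      # 20 pings fired at the same moment
print(warmed_instances(spaced, 0.25), warmed_instances(burst, 0.25))
```

In this model, short pings spread over time keep landing on the same few instances, while pings that overlap force one instance each, which is why the first ping has to stay active until the last one has started.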

Also, API Gateway has an algorithm to decide whether to launch a new lambda or hold a request, hoping that an already warm lambda will free up.

jfbaro 2477 days ago

Wow! That's great. Cold starts are no longer a show stopper! Rust powered APIs running on AWS .. It sounds really exciting

reilly3000 2477 days ago

This is great news, but I'm bummed they didn't bundle the NAT gateway with this service. In a typical function that calls out to get data from a service and reads/writes from a DB in a VPC, that requires the somewhat painful configuration of a NAT gateway and dedicated subnets, as well as a $36/month bill for the NAT gateway service.

There are some workarounds that use multiple lambdas, but they have their own gotchas.

Still, hooray, this is good news. The Data API is great for Serverless Aurora, but I can't use that with BI tools.

abhorrence 2476 days ago

You can run your own gateway instance(s) for a lot cheaper than the nat gateway service. There are definitely some tradeoffs, but if $36/mo is an issue, they can be worthwhile: https://docs.aws.amazon.com/vpc/latest/userguide/VPC_NAT_Ins...

scarface74 2476 days ago

This is not meant to be a criticism of AWS, I’m an AWS true believer, but the main purpose of going to AWS is to make the “undifferentiated heavy lifting” someone else’s problem not to save money.

Going to AWS to save money on resources is about like going to the Apple Store to buy a cheap laptop.

reilly3000 2476 days ago

I’m not against AWS or $36/mo. It just is kinda a drag when the promise of serverless is pay per use and scaling to zero. You could get a nice EC2 t3.medium and do a lot more RPS for the cost of that NAT and Lambda invocations.

scarface74 2476 days ago

If you don’t care about cold starts, there is always Aurora Serverless with the Data API. I don’t believe it requires either a NAT or for the lambda to be attached to your VPC.

EwanToo 2477 days ago

This is a great improvement for Lambda users, much reduced cold start times!

paulddraper 2477 days ago

Iconoclast view ahead (change my mind please):

AWS does tons of stuff around VPCs....I feel like they really want me to use them (or their customers really want to use them), but I just don't see why.

I just run RDS on the internet. I don't have to muck with the complexity or cost of NATs or peering or Lambda slow start or any other weird networking issues.

I know it's "public", but that seems irrelevant in the era of cloud services. This isn't any different than, say, how Firebase or a million other services run. Should I be concerned that my Firebase apps are insecure because someone isn't overlaying a 10.* network on them?

EDIT: I should clarify that I understand the legitimacy of security groups, especially for technologies that weren't meant to operate outside a firewall. But that's mostly a different subject; AWS had security groups years before VPCs and subnets and NATs.

saurik 2477 days ago

So making the actual listening port for a database server "public" is generally a bad idea as that is another attack surface of code that honestly is hardly ever made public... but if when you say "public" you mean you are using security groups (which are super trivial to use and easy to understand) to define which other AWS devices can access the port, then yeah: I have never seen any reason why this entire feature should exist and the concept of having to think about IP address ranges as if they somehow matter is one of the things I was escaping when I moved to cloud in the first place, and somehow they wanted to reintroduce it? Why?!? It doesn't even work well (!!), and introduces tons of latency into everything it touches (not just Lambda) :/.

scoopertrooper 2476 days ago

VPCs are very helpful for when you have a large number of developers working in an AWS environment. It'd be oh so easy for a developer to accidentally change a bit of terraform and expose your database to the internet without VPCs.

paulddraper 2477 days ago

My theory is that a bunch of entrenched network engineers just really like subnets and IPv4 and NAT and don't realize how mostly unnecessary it is in an era of cloud infrastructure and IPv6.

My grandchildren are still going to be NAT'ing.

StreamBright 2477 days ago

I think it is easy to have dev, qa and prod VPCs. Without VPC these separate infrastructure groups might be harder to split out. I usually reference security groups instead of subnets in security groups, avoiding referencing IP ranges (v4 or v6) entirely.
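As a sketch of what referencing a security group instead of an IP range looks like, the helper below builds an ingress permission in the shape that the EC2 `authorize_security_group_ingress` call accepts in its `IpPermissions` parameter. The helper name and the group ID are made up for illustration.

```python
def sg_to_sg_rule(source_group_id, port, protocol="tcp"):
    """Build an ingress permission that names a source security group, not a CIDR."""
    return {
        "IpProtocol": protocol,
        "FromPort": port,
        "ToPort": port,
        # The source is another security group, so no IP range appears here.
        "UserIdGroupPairs": [{"GroupId": source_group_id}],
    }


# Hypothetical group ID for the application tier; 5432 is the PostgreSQL port.
rule = sg_to_sg_rule("sg-0123456789abcdef0", 5432)
```

Because the rule names a group rather than addresses, it keeps working when instances in the source group change IPs.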

watermelon0 2477 days ago

You can also use multiple AWS accounts to separate those environments, which also eases user management (usually you have different people with access to each environment, with some overlap).

This also means that developers can have close to admin privileges, since the worst they can do, is to disrupt work of another developer, without affecting either QA or production.

WaxProlix 2477 days ago

Accounts are the correct level to separate these at. Keeps credentials easier to manage for devs, techs, etc, and limits blast radius if unauthorized accesses take place.

scarface74 2477 days ago

Which is really an old school way of doing it. Having multiple VPC's doesn't get around account limits, resources with the same name (sns topics, queues, stacks, etc.)

Just use separate accounts in an Organization.

You can give your developers almost complete unrestricted access to your dev account.

StreamBright 2476 days ago

I work with both styles of AWS installations. Having an organization is a constant pain for people who access multiple accounts even with something like Okta. 1 browser can access 1 account and if you switch you have to go through the switching process, or use multiple browsers. Quite often people would like to access cross account resources, which is a whole different level of discomfort. This is why I still think, old school or not, that having a single account and multiple VPCs is a better option.

fulafel 2476 days ago

This is true, AWS is pretty anti-internet in all their architecture recommendations. IMO security is better done by firewalling and protocol level authentication (belt + suspenders) because it keeps your configuration clean and understandable, and complexity is the enemy of security.

The attitude has two things in AWS's interest: 1) keep lock-in by encouraging customers to build AWS-internal networks 2) don't scare away the lift-and-shift customers who want to transplant their 1990s style "intranet" (or mental model, at least) onto AWS.

Explains also why they aren't very keen about IPv6 because that would encourage internetworking.

Just don't tell anyone that you can access the AWS console from the internet :)

scarface74 2476 days ago

It’s never been considered best practice to expose services needlessly to the Internet. I’m as far from an old school net ops guy as you can get and jump at any new AWS technology that’s feasible as fast as anyone, but it would be the height of stupidity for me to expose my Aurora cluster to the Internet. Good luck explaining that to your external auditors.

fulafel 2476 days ago

Of course. I'm just saying that firewalling and end-to-end security are better ways of doing that than routing and ambiguous (rfc1918) addressing. Never trust the network, lest you end up making yours soft and chewy on the inside.

scarface74 2476 days ago

How do you propose you firewall your database access and only allow certain IP addresses when you need access from lambda when the lambda is always run from a random location on AWS’s network?

A lambda is never run “from within your VPC”, it’s attached via an ENI (or at least it was).

fulafel 2476 days ago

Yeah, this kind of thing is part of what I meant when I criticised AWS encouraging VPC use instead of end-to-end security.

But off the top of my head, you could always use the firewall API from the lambda to open network access between it and the RDS when the lambda starts. (In addition to using certs or IAM security on your TLS connection to the RDS db)
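A hedged sketch of that idea: `ec2_client` is assumed to be a boto3 EC2 client (for example `boto3.client("ec2")`), passed in so the function itself has no dependency on boto3. The function name, the rule description, and the default port are illustrative. How the Lambda would learn its own source address is left out, and the rule would also need to be removed afterwards (for example with `revoke_security_group_ingress`).

```python
def open_db_access(ec2_client, group_id, source_cidr, port=5432):
    """Ask EC2 to allow source_cidr to reach the database port on group_id."""
    return ec2_client.authorize_security_group_ingress(
        GroupId=group_id,
        IpPermissions=[
            {
                "IpProtocol": "tcp",
                "FromPort": port,
                "ToPort": port,
                "IpRanges": [
                    # Description text is illustrative only.
                    {"CidrIp": source_cidr, "Description": "temporary Lambda access"}
                ],
            }
        ],
    )
```

Opening the rule on every cold start adds an API call to the startup path and requires the function's role to be allowed to modify the security group, so this is a trade-off and not a free replacement for a VPC.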

enitihas 2477 days ago

VPCs are very useful when running things like ElastiCache though (Memcached and Redis), because AFAIK those don't have an authentication ecosystem so making them public would be a terrible idea.

paulddraper 2477 days ago

Memcache has had reliable authentication (SASL) for some time. Redis has authentication meant to be a secondary protection.

But that's a good point.

I suppose all the services I use already have security models (usually more complex, multi-user ones, so agent X can read but not modify, etc.).

HOWEVER...this could be solved with security groups, but it seems that's not the model AWS has emphasized. Security groups are orthogonal to NAT and private networks; AWS had security groups before it had VPCs.

saurik 2477 days ago

Just use security groups, which fully solved this problem without all of the overhead and complexity of VPC.

cle 2477 days ago

Defense in depth. Not having public routes to your database adds another layer of protection. You should have multiple, and they should be redundant.

Scarbutt 2477 days ago

Firebase was made specifically for the cloud, RDS is the cloud atop postgres, I don't know how secure RDS is (against the myriads of attacks) but it wouldn't be a bad idea to use the built-in aws firewall to at least restrict access to trusted IPs ;)

Also, VPCs are really useful if you have many systems and services(yours or theirs) inside AWS.

dragonwriter 2477 days ago

> RDS is the cloud atop postgres

Or MySQL. Or SQL Server.

paxys 2476 days ago

Exposing a database to the public internet is a terrible idea. Yes, it's behind an auth layer, but is a username and password really enough protection for literally all of your company's data? Heck most people here have probably set up 2FA for their social media profiles, and for good reason.

paulddraper 2476 days ago

> Exposing a database to the public internet is a terrible idea.

Isn't that a core idea of Firebase? Or Dynamo?

paxys 2476 days ago

Not sure about Firebase, but DynamoDB can be behind your VPC. From what I know about Firebase, it's meant to be a backend for mobile apps, so I guess it makes sense for it to be public.

dragonwriter 2477 days ago

> AWS does tons of stuff around VPCs....I feel like they really want me to use them (or their customers really want to use them)

VPC is a very convenient fit for enterprise customers extending on-premises networks into the cloud, I think that's the market it's mainly focussed on.

> I know it's "public", but that seems irrelevant in the era of cloud services.

It's not irrelevant, but neither is it necessarily critical all the time; there doesn't need to be a one-size- (or even one-shape-)fits-all universal approach to network security, and AWS encompasses a lot of different customer setups, including enterprises for which it is a virtual extension of the on-premises internal network.

serkanh 2476 days ago

Say you have a bunch of EC2 instances with public IP addresses that run an application that makes calls to a 3rd party service. Say that 3rd party service only allows access from certain IP ranges, would you rather give them a single IP or hundreds of IPs for them to whitelist? What you say may be acceptable for small infra but not in a large setup.

momokoko 2477 days ago

You need to realize that the point of AWS is lock in. Once your service becomes a ball of various AWS pieces, it becomes almost impossible to leave once you start scaling.

So there is always a priority towards things that cause more lock in like VPC.

AmericanChopper 2477 days ago

I don’t think AWS want you to use VPC at all. The Golden Path for serverless on AWS has always been “networkless”. If your use case fits into their stateless HTTP stack (API Gateway + Lambda + Dynamo + SQS...) then you’re gonna have a really easy time. The reason VPC is required is because not every use case is going to fit into that stack, and the fact that VPC functionality seems to be always just a little bit not good enough (in comparison) doesn’t make me think they’re pushing people towards it.

zten 2477 days ago

They definitely do if you're trying to use things based on EC2 instances. The newest types of instances have been VPC-only for years now.

AmericanChopper 2477 days ago

But if you’re using EC2, then you’ve already wandered far off the serverless path.