"Solves" might be strong, but it removes a big portion of the cold start latency that was difficult to optimize for and out of developers' control. Creating minimal images isn't difficult in a number of environments (e.g. webpacking your node.js lambdas), and barring necessarily large images (think pandas on Lambda), this puts a lot of control over the cold start p99 back in the hands of customers.
Warming functions in the previous VPC architecture was always a questionable practice. You had no guarantee that your environments would be warm across all subnets or which subnets would handle incoming requests. Beyond that, what happens to requests which you receive when the function is being warmed? You still incur cold starts.
There has never been a guarantee of environment reuse. Any architecture which can't tolerate cold starts is not a good fit for serverless.
Sorry, but it does not matter how many, since everything is automated and you create the warm-up scheduler when you create the function. As others pointed out in this thread, there are other challenges with this approach.
>> Just use Fargate
We tried it and decided it is not our cup of tea. Lambdas are.
Yes, it does matter. In your scheduler, how do you ensure your ping (the way you start an instance) is actually creating another instance to keep warm rather than reusing an existing one?
If you want to always keep 20 instances warm, you have to keep the first ping active until the 20th one is done.
In other words, if you want to keep 20 instances warm and you send 20 requests in 5 seconds, but each request only takes .25 seconds, you will only have 5 warm lambdas. The 6th real concurrent connection will still have a cold start. Also, while you are pinging an instance to keep it warm, that instance can't serve a real user.
Also, API Gateway has an algorithm to decide whether to launch a new lambda or hold a request in the hope that an already warm lambda will free up.
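To make the warm-pool arithmetic above concrete, here is a rough sketch that counts how many pings are in flight at the same moment. It assumes each in-flight request occupies exactly one execution environment, so the peak overlap is the most environments a round of pings can keep warm. The function name `peak_concurrency` and the ping schedules are made up for illustration, and the real number depends on how the scheduler spaces its pings.

```python
def peak_concurrency(start_times_ms, duration_ms):
    """Return the maximum number of requests in flight at once.

    Each request occupies the half-open interval [start, start + duration).
    """
    events = []
    for start in start_times_ms:
        events.append((start, 1))                 # request begins
        events.append((start + duration_ms, -1))  # request ends
    # At equal timestamps, ends (-1) sort before starts (+1), so an
    # environment that just finished can be reused by the next ping.
    events.sort()
    in_flight = peak = 0
    for _, delta in events:
        in_flight += delta
        peak = max(peak, in_flight)
    return peak


# 20 pings spread evenly over 5 s, each handled in 250 ms: no overlap at all.
evenly_spaced = peak_concurrency(range(0, 5000, 250), 250)

# 20 pings fired 50 ms apart, each handled in 250 ms: only a few overlap.
quick_burst = peak_concurrency(range(0, 1000, 50), 250)

# Same 20 pings, but each is held open for 1 s, so the first ping is still
# running when the 20th starts: all 20 overlap.
held_open = peak_concurrency(range(0, 1000, 50), 1000)

print(evenly_spaced, quick_burst, held_open)
```

Only the last schedule keeps all 20 environments busy at once, which is the point about holding the first ping open until the 20th has started.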
Overall, definitely a big win!