by eunoia 2692 days ago
Out of curiosity, what’s the use case for running ECS on EC2 (instead of using Fargate) these days?
7 comments

AWS employee here. If you can consistently achieve greater than 50% utilization of your EC2 instances, or you have a high percentage of spot or reserved instances, then ECS on EC2 is still cheaper than Fargate. If your workload is very large and requires many instances, this may make the economics of ECS on EC2 more attractive than Fargate. (Almost never the case for small workloads, though.)
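To make the utilization argument concrete, here is a rough sketch of the break-even arithmetic. The function names and the hourly prices are hypothetical placeholders, not real AWS pricing; the only idea taken from the comment above is that you pay for a whole EC2 instance whether or not you use it, while Fargate bills for what the task requests.

```python
def break_even_utilization(ec2_hourly: float, fargate_hourly_equiv: float) -> float:
    """Utilization above which EC2 is cheaper than Fargate.

    ec2_hourly: price of one EC2 instance per hour (hypothetical input).
    fargate_hourly_equiv: price of buying that same instance's full
        CPU/memory capacity on Fargate for one hour (hypothetical input).
    """
    if ec2_hourly <= 0 or fargate_hourly_equiv <= 0:
        raise ValueError("prices must be positive")
    # At utilization u, the effective EC2 price per used unit is ec2_hourly / u.
    # Setting that equal to the Fargate price gives u = ec2_hourly / fargate_hourly_equiv.
    return min(1.0, ec2_hourly / fargate_hourly_equiv)


def cheaper_option(utilization: float, ec2_hourly: float, fargate_hourly_equiv: float) -> str:
    """Return "ec2" or "fargate" for a given average utilization in (0, 1]."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    effective_ec2 = ec2_hourly / utilization
    return "ec2" if effective_ec2 < fargate_hourly_equiv else "fargate"


# Made-up prices chosen so that the break-even lands at 50%.
EC2_PRICE = 0.10
FARGATE_PRICE = 0.20
threshold = break_even_utilization(EC2_PRICE, FARGATE_PRICE)
```

Spot or reserved pricing lowers `ec2_hourly`, which lowers the break-even point; that is why a high share of spot or reserved instances tips the comparison toward EC2.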

Additionally, a major use case for ECS is machine learning workloads powered by GPUs, and Fargate does not yet support GPUs. With ECS you can run p2 or p3 instances and orchestrate machine learning containers across them, with GPU reservation and GPU pinning.

I'm not totally up to speed on ECS vs EKS economics, but it seems like EKS with p2/p3 would be a sweet solution for this. Even better if you have a mixed workload and want to easily target GPU-enabled instances by tainting those nodes and adding a matching toleration to the podspec.
Kubernetes GPU scheduling is currently still marked as experimental: https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus...

ECS GPU scheduling is production ready, and the initial getting-started workflow is quite a bit more streamlined because we provide a maintained GPU-optimized AMI for ECS that already has the NVIDIA kernel drivers and Docker GPU runtime installed. ECS supports GPU pinning for maximum performance, as well as mixed CPU and GPU workloads in the same cluster: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/...

Apart from pricing and the potential to overcommit resources on EC2-ECS, there are a couple of other differences.

One is your options for doing forensics on Fargate. AWS manages the underlying host, so you give up the option of doing host-level investigations. It’s not necessarily worse, as you can fill this gap in other ways.

Logging is currently only via CloudWatch Logs, so if you want to get logs into something like Splunk you’ll have to run something that can pick up these logs. You’ll have that issue to solve anyway if you want logs from other AWS services like Lambda to go to the same place. The bigger issue for us is that you can’t add additional metadata to log events without building that into your application or getting tricky with log group names. On EC2 we’ve been using fluentd to add additional context to each log event, like the instance it came from, the AZ, etc. Support for additional log drivers on Fargate is on the public roadmap[1][2], so there will hopefully be some more options soon.

[1] Fargate log driver support v1 https://github.com/aws/containers-roadmap/issues/9
[2] Fargate log driver support v2 https://github.com/aws/containers-roadmap/issues/10
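For anyone wondering what "adding context to each log event" amounts to, here is a minimal sketch in plain Python. It is not fluentd configuration and not our actual setup; the function name and the field names (`instance_id`, `az`) are made up for illustration.

```python
import json


def enrich(event: dict, context: dict) -> dict:
    """Return a new log event with host context merged in.

    Fields already present on the event win over context fields, so the
    application's own data is never overwritten. Neither input is mutated.
    """
    merged = dict(context)
    merged.update(event)
    return merged


# Hypothetical host context, gathered once at startup on an EC2 host.
host_context = {"instance_id": "i-0123456789abcdef0", "az": "us-east-1a"}

# A log event as the application emitted it.
app_event = {"level": "info", "message": "request served"}

enriched = enrich(app_event, host_context)
line = json.dumps(enriched, sort_keys=True)  # one JSON line, ready to forward
```

On Fargate with only CloudWatch Logs there is no host-side hook where a step like this can run, which is why the enrichment has to move into the application itself.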

At least one is that you get to use the leftover CPU and memory for your other containers when you use an EC2 instance. With some workloads this lets you overcommit those resources if you know your containers won't all max out simultaneously.
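A rough sketch of what overcommitting looks like in numbers, assuming each container has a soft memory reservation (what placement counts against the instance) and a higher hard limit (what it may burst to). The container names, sizes, and the 4096 MiB instance are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Container:
    name: str
    reserved_mib: int  # soft reservation, counted at placement time
    limit_mib: int     # hard ceiling the container may burst up to


def fits(containers: list[Container], capacity_mib: int) -> bool:
    """True if the soft reservations all fit on the instance."""
    return sum(c.reserved_mib for c in containers) <= capacity_mib


def overcommit_ratio(containers: list[Container], capacity_mib: int) -> float:
    """Sum of hard limits divided by capacity; above 1.0 means overcommitted."""
    return sum(c.limit_mib for c in containers) / capacity_mib


# Hypothetical workload on a hypothetical 4096 MiB instance.
workload = [
    Container("web", reserved_mib=1024, limit_mib=2048),
    Container("worker", reserved_mib=1024, limit_mib=2048),
    Container("cron", reserved_mib=512, limit_mib=1024),
]
CAPACITY_MIB = 4096

placeable = fits(workload, CAPACITY_MIB)
ratio = overcommit_ratio(workload, CAPACITY_MIB)
```

Here the reservations total 2560 MiB, so everything places, while the hard limits total 5120 MiB, a ratio of 1.25. That only works as long as the containers don't all hit their limits at the same moment.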

Edit: another one is that you can run ECS on spot fleet and save some money.

Fargate is orthogonal to ECS, and the two can be used together. The difference is that instead of spinning up VMs as hosts, configuring them for ECS, and worrying about running just the right number of them, you select Fargate, which does all of that behind the scenes (kind of like Lambda). The VM instances provided by Fargate are a bit more expensive, though.
A large part of our ECS capacity is running on spot instances which are much cheaper.
Perhaps cost?
Cost and legacy.