|
|
|
|
|
by drawnwren
676 days ago
|
|
Genuine question: how are you handling load balancing, log aggregation, failure restart + readiness checks, deployment pipelines, and machine maintenance schedules with these “simple” setups? Because as annoying as getting the prometheus + loki + tempo + promtail stack going on k8s is —- I don’t really believe that writing it from scratch is easier. |
|
* Log aggregation happens out of the box with CloudWatch Logs and CloudWatch Log Insights. It's configurable if you want different behavior
* On ECS, you configure a "service" which describes how many instances of a "task" you want to keep running at a given time. It's the abstraction that handles spinning up new tasks when one fails
* ECS supports ready checks, and (as noted above) integrates with ALB so that requests don't get sent to containers until they pass a readiness check
* Machine maintenance schedules are non-existent if you use ECS / Fargate, or at least they're abstracted from you. As long as your application is built such that it can spin up a new task to replace your old one, it's something that will happen automatically when AWS decommissions the hardware it's running on. If you're using ECS without Fargate, it's as simple as changing the autoscaling group to use a newer AMI. By default, this won't replace all of the old instances, but will use the new AMI when spinning up new instances
But again, though: the biggest selling point is the lack of maintenance / babysitting. If you set up your stack using ECS / Fargate and an ALB five years ago, it's still working, and you've probably done almost nothing to keep it that way.
You might be able to do the same with Kubernetes, but your control plane will be out of date, your OSes will have many missed security updates. Might even need a major version update to the next LTS. Prometheus, Loki, Tempo, Promtail will be behind. Your helm charts will be revisions behind. Newer ones might depend on newer apiVersions that your control plane won't support until you update it. And don't forget to update your CNI plugin across your cluster, too.
It's at least one full time job just keeping all that stuff working and up-to-date. And it takes a lot more know-how than just ECS and ALB.