by latebird22 1480 days ago
Interesting thought. I can see your case for some growing companies. However, I'm not sure you would always need Kubernetes for this. What would you like to adjust when we grow into a 1000-person company? I think you still have similar requirements, like autoscaling, resource allocation, zero-downtime deployments, etc. These are also possible without direct access to the container management.

And there are a lot of companies which do not become the next Unicorn and need an easy way to manage their container workloads.

SetOps currently uses ECS, since it comes with no additional overhead costs for the management plane/API and does the container management job well enough. However, this is not a final decision, and ECS could be replaced in the future. The main point is that users get a simple abstraction for managing their workloads and that the "backend" is interchangeable.


Here are a few examples of policies/needs that I've seen companies run up against at scale. K8s does a great job solving them.

* Understanding which workloads share a node's memory/CPU, and isolating certain workloads for security reasons

* Running specific workloads on specific instance types (e.g. with GPU or extra CPU)

* Configuring network policy between workloads

* Airgapping certain workloads

* Setting priority levels for different workloads, so some scale more rapidly while others have to wait for a new node to be provisioned

* Customized scaling behavior (e.g. based on the depth of a queue or latency metrics)

* Multi-region support for DR

I could probably go on :)
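To make the queue-depth scaling item above concrete, here is a minimal sketch of the kind of rule such a policy boils down to. The function name, the per-replica target, and the min/max bounds are all hypothetical and not taken from any particular orchestrator; it only illustrates the "divide the backlog by what one worker should handle, then clamp" idea.

```python
import math


def desired_replicas(queue_depth: int, target_per_replica: int,
                     min_replicas: int = 1, max_replicas: int = 20) -> int:
    """Return a replica count so each worker handles roughly
    `target_per_replica` queued messages (hypothetical policy)."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    # Workers needed to drain the backlog at the target load per worker.
    wanted = math.ceil(queue_depth / target_per_replica)
    # Clamp to the configured bounds so we never scale to zero or run away.
    return max(min_replicas, min(max_replicas, wanted))
```

With a target of 100 messages per worker, a backlog of 250 messages asks for 3 workers, an empty queue stays at the minimum, and a very deep queue is capped at the maximum.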

You are totally right. For these specific use cases you probably want full control. :) I guess if you have these requirements a tool like SetOps, which simplifies the management, might not be the right fit then.

That said, some of these requirements, like running specific workloads on specific instance types, could easily be implemented.

> SetOps currently uses ECS

Not everything is a stateless HTTP microservice. Solutions like ECS start to fall apart when you try to run stateful workloads, especially when the lifecycle of the workloads needs to be coordinated to prevent loss of availability or data (i.e. the workload cannot tolerate 2/3 of the containers being knocked offline at the same time). AWS does not offer a managed service (like RDS) for every datastore, and many of the ones it does offer (e.g. MSK) are "let's tick this box in the quest for covering all our customer needs" products that are not cost-effective for production workloads.
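As a rough illustration of the coordination problem, here is a sketch of the check a scheduler would need before stopping containers of a replicated datastore. It assumes the datastore needs a strict majority of replicas online to stay available, which is an assumption for illustration rather than something stated above, and the function names are hypothetical.

```python
def max_safe_disruptions(replicas: int) -> int:
    """How many replicas may be offline at once while a strict
    majority (assumed quorum rule) stays online."""
    quorum = replicas // 2 + 1
    return replicas - quorum


def can_stop(replicas: int, currently_down: int, to_stop: int) -> bool:
    """True if stopping `to_stop` more replicas still leaves a quorum."""
    return currently_down + to_stop <= max_safe_disruptions(replicas)
```

Under that assumption, a 3-replica cluster can lose one container at a time but not two, so a rolling restart has to wait for each replica to rejoin before touching the next.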

Maybe, as a product, you make a decision to tell your customers, when you need to run something like that, go hire DevOps and migrate off. But you'll be more credible if you're up-front with what kinds of workloads you don't intend to support, so that customers who have a strategic vision for engineering can say, hey these guys will be great for me for the next few years, now I'm more likely to buy in.

They do have Batch, which IIRC was originally only for ECS, for that scenario.

It's quite horrible in many ways, but I don't think any of that is really because the ECS model is somehow inherently unsuitable for stateful long-running workloads; it's Batch being a bad product.

Both ECS and Batch are just leaky enough abstractions that some unknown unknown or forgotten nuance will come along and bite you when you least need it, and make you question the value proposition.