|
|
|
|
|
by hintymad
703 days ago
|
|
I wonder how many iterations we will need before engineers are happy with a workflow solution. Netflix had multiple solutions before Maestro, such as metaflow. Uber built multiple solutions too. Amazon had at least a dozen internal workflow engines. It's quite curious why engineers are so keen on building their own workflow engines. Update: I just find it really interesting that many individuals in many companies like to build workflow engines. This is a not deriding comment towards anyone or Netflix in particular. To me, such observation is worth some friendly chitchat. |
|
In reality there are five main concerns: 1. Resource scheduling-- "I have a job or collection of jobs to run... allocate them to the machines I have" 2. Dependency solving-- If my jobs have dependencies on each other, perform the topological sort so I can dispatch things to my resource scheduler 3. API/DSL for creating jobs and workflows. I want to define a DAG... sometimes static, sometimes on the fly. 4. Cron-like functionality. I want to be able to run things on a schedule or ad-hoc. 5. Domain awareness-- If doing ETL I want my DAGs to be data aware... if doing ML/AI workflows then I want to be able to surface info about what I'm actually doing with them
No one solution does all these things cleanly. So companies end up building or hacking around off the shelf stuff to deal with the downsides of existing solutions. Hence it's a perpetual cycle of everyone being unhappy.
I don't think that you can just spin up a startup to deliver this as a "solution". This needs to be solved with an open source ecosystem of good pluggable modular components.