Hacker News new | ask | show | jobs
by caravel 2848 days ago
[full disclosure, Airflow committer here] I've never heard of "HP Operation Orchestration", but that looks like a drag and drop enterprise tool from a different Windows-GUI era. Airflow is very different, it's workflows defined as code which is a totally different paradigm. The Airflow community believes that when workflows are defined as code, it's easier to collaborate, test, evolve and maintain them. Though maybe the HP tool exposes an API?

To address some of your comments:

- About stability, I'm not sure what version you've used, or which executor you were using, but if stability was a concern I don't think we'd have such a large and thriving community. Nothing is perfect, but clearly it's working well for hundreds of companies.

- About stopping jobs, if the task instance state is altered or cleared (through the web ui or CLI), the task will exit and the failure will be handled. Earlier versions (maybe 1-2 years back?) did not always do that properly.

- About Python: Python 3.7 was released late June 2018, and I think there are PRs addressing the `async` issue already. We fully support 2.7 to 3.6, and 3.7 very soon. You need to give software a chance and a bit of time to adapt to new standards. I wonder which % of pypi packages support 3.7 at this point, or how many have 3.7 in their build matrix, but my guess is that it's very low.

- XCOMs are fine in many cases, though if you're not passing data or metadata from a task to another that doesn't mean that there's no context that exist for the execution of the task. We recommend having determinisc context and data locations (meaning the same task instance would always target the same partition or data location).

- Dynamic: the talk linked to above clarifies what kind of dynamism Airflow supports. It's very common to build DAGs dynamically, though the shape of the DAG cannot shape at runtime. Conceptually an Airflow DAG is a proper directed acyclic graph, not a DAG factory or many DAGs at once. Note that you can still write dynamic DAG factories if you want to create DAGs that change based on input.

- No optimization: the contract is simple, Airflow executes the tasks you define. While you can do data-processing work in Python inside an Airflow task (data flow), we recommend to use Airflow to orchestrate more specialized systems like Spark/Hadoop/Flink/BigQuery/Presto to do the actual data processing. Airflow is first and foremost an orchestrator.

- DAG parsing timeouts are configurable. DAG creation times for large DAGs have been improved in the past versions. But clearly it's easier for humans to reason about smaller DAGs. DAGs with thousands of tasks aren't ideal but they are manageable.

One thing to keep in mind is that thriving open source software evolves quickly, and Airflow gets 100+ commits monthly, and has dozens of engineers from many organizations working on it full time. From what I can see it's clearly the most active project in this space.

1 comments

HP OO is an entire system that has an XML structure command structure to customize the job that you're working on. It has a GUI that is used to build out the flows, run and test. It has a backend system to audit, admin, and visualize the current process, and it has workers to scale out the work. It's a bit of a more mature setup.

The claim that the setup of your workflow has to be code isn't necessarily a good thing. Your recipe should be descriptive, not imperative.

----

To answer your response:

1. Stability: I was working with apache-airflow 1.9 (last release: Jan2018) 1.10 was just released 2 days ago. I frequently had issues where deleting more than 3 tasks would cause that mushroom cloud error message. Also, I've had cases where the task could max out on memory and take the whole system down.

2. 1.9.0: Stopping jobs: I saw this issue where that a task would be running, I would stop+delete the task and start a new one. I frequently saw the case where I had to wait a while for the triggered dag to continue running.

3. Python3.7: Yes, it was addressed on the PR. However, for things like that we (the users) need a quick turn arround/hotfix for stuff like that. It got released late (lets say 27 June, and the latest version 1.10 was 28 August [with a 7month gap]) It's just painful to have this upgrade just break something internally in Airflow.

4. From what I've seen in situations where the work for the task is huge is that there is an expectation of the task to handle the workload and splitting up the workload it's self. (Since you can't define a span out of the tasks based on the workload) That's no beuno.

Timeouts: From what I have seen there are issues where the next dag run scheduled can interfere with the last one. This is an issue given the timeouts, retries, and reoccurring schedules. (yes you can say.. that's user's choice.. however, workloads and performance can change without notice)

----

Another issue I had: There is no way to trigger a task and it's depending tasks without triggering the whole dag. This makes long-running dags with lots of tasks difficult to debug and test.

Also there is a slight difference between airflow run (task) and test. Sometimes you use one vs the other.