by trumpeta
1495 days ago
We operate a (small?) Airflow instance with ~20 DAGs, but one of those DAGs has ~1k tasks. It runs on a k8s/AWS setup with MySQL backing it. We package all the code in 1-2 different Docker images and then create the DAG. We've faced many issues (logs out of order or missing, random race conditions, random task failures, etc.). But what annoys me the most is that for that one big DAG the UI is completely useless: the tree view has insane duplication, the graph view is super slow and hard to navigate, and answering basic questions like what exactly failed and which nodes are around it is not easy.
|
More recent versions of Airflow added TaskGroups (https://airflow.apache.org/docs/apache-airflow/stable/concep..., https://www.astronomer.io/guides/task-groups/ ), which should help with this a bit.
At ~1k nodes in the graph, introspection becomes hard anyway; as others have suggested, breaking the DAG down if possible might be a good idea.
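
For the "what exactly failed and what is around it" question, one option that sidesteps the UI is to answer it from the dependency edges directly. This is only a rough sketch in plain Python, not Airflow code: the function name `failed_with_neighbors`, the edge list, and the state strings are all made up for illustration, and it assumes you can export the task dependencies and task states from your own setup somehow.

```python
from collections import defaultdict


def failed_with_neighbors(edges, states):
    """Map each failed task to its direct upstream and downstream tasks.

    edges:  iterable of (parent_task_id, child_task_id) pairs
    states: dict of task_id -> state string (hypothetical values)
    """
    upstream = defaultdict(set)
    downstream = defaultdict(set)
    for parent, child in edges:
        downstream[parent].add(child)
        upstream[child].add(parent)

    report = {}
    for task, state in states.items():
        if state == "failed":
            report[task] = {
                "upstream": sorted(upstream[task]),
                "downstream": sorted(downstream[task]),
            }
    return report


# Hypothetical toy data standing in for a real export of the big DAG.
edges = [("extract", "clean"), ("clean", "load"), ("clean", "report")]
states = {
    "extract": "success",
    "clean": "failed",
    "load": "upstream_failed",
    "report": "upstream_failed",
}

report = failed_with_neighbors(edges, states)
for task, around in report.items():
    print(task, around)
```

It only looks one hop in each direction, which is usually the part the graph view makes painful at ~1k nodes; walking further out would need a proper traversal.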