by zukzuk
1588 days ago
Everyone's context is different, but I've found the exact opposite to be true. Airflow is simple and dumb enough to be easily understood and managed by a small team, yet flexible and powerful enough that we can't come up with a good reason to switch to anything else.

We are, however, becoming more and more reliant on dbt, and the article makes a good point about Airflow providing no visibility into what's going on inside a dbt node. So we're ending up with an increasingly simple Airflow DAG, with most of the complexity hidden inside a single dbt node.
|
We use dbt to manage the DAG for the BQ transformations, package it in a container, and deploy that as a single node into the Kubernetes cluster Airflow is running on.
Airflow then handles the scheduling and the DAG nodes for non-DWH dependencies, such as loading or checking for files and kicking off tasks that need to run after the DWH refresh.
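To make the shape concrete, here is a rough sketch of that outer DAG in plain Python. It uses the standard library's `graphlib` rather than Airflow itself, and every task name is hypothetical. The point is only that the whole dbt project shows up as one node between the file handling and the post-refresh tasks.

```python
from graphlib import TopologicalSorter

# Hypothetical task names. Each task maps to the set of tasks it depends on.
outer_dag = {
    "wait_for_files": set(),
    "load_files": {"wait_for_files"},
    # One opaque node: the containerised dbt project runs its own DAG inside.
    "dbt_run": {"load_files"},
    "post_refresh_export": {"dbt_run"},
    "notify": {"dbt_run"},
}


def run_order(dag):
    """Return one valid execution order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())


order = run_order(outer_dag)
print(order)
```

Everything dbt does internally (models, tests, their ordering) is invisible at this level, which is exactly the visibility trade-off mentioned above.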
I find that, once it is set up, the pattern is extremely easy for small teams to follow, and the single view of all the running pipelines is a great benefit. Airflow also handles the logic around last successful runs and the like, which you would have to implement yourself with simple cron jobs.
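For a sense of what "implement yourself" means with cron, here is a minimal sketch of the bookkeeping, assuming fixed-length intervals and a stored marker for the last interval that succeeded. The function names are made up for illustration, and this is not how Airflow does it internally.

```python
from datetime import datetime, timedelta


def pending_runs(last_success, now, interval):
    """Return the start of every complete interval after the last successful one."""
    runs = []
    start = last_success + interval
    # An interval is only runnable once it has fully elapsed.
    while start + interval <= now:
        runs.append(start)
        start += interval
    return runs


def catch_up(last_success, now, interval, task):
    """Run missed intervals in order; stop at the first failure.

    Returns the new marker, which the caller would have to persist somewhere.
    """
    for start in pending_runs(last_success, now, interval):
        try:
            task(start)
        except Exception:
            break  # keep the old marker so the next cron tick retries this interval
        last_success = start
    return last_success


day = timedelta(days=1)
marker = datetime(2024, 1, 1)
now = datetime(2024, 1, 4, 6, 0)
print(pending_runs(marker, now, day))
```

On top of this you would still need somewhere durable to keep the marker, locking so two cron ticks don't overlap, and some way to see what failed, which is the part the scheduler gives you for free.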