by hulahoof 1588 days ago
This reflects how I often deploy Airflow as well (usually on GCP, as Cloud Composer).

We use dbt to manage the DAG for the BigQuery (BQ) transformations, package it in a container, and deploy that container into the Kubernetes cluster Airflow is running on, where it shows up as a single node in the Airflow DAG.

Airflow can then handle the scheduling and the DAG nodes for non-DWH dependencies, such as loading or checking for files and kicking off tasks that need to run after the DWH refresh.
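
To make the shape concrete, here is a rough sketch of that dependency graph in plain Python, using the standard library's `graphlib` as a stand-in for Airflow itself. The task names (`check_files`, `load_files`, `dbt_run`, `post_refresh_export`) are made up for illustration; the only point is that the whole dbt project appears as one node between the non-DWH tasks.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each key maps to the set of tasks it depends on.
# The entire dbt project (with its own internal DAG) is the single node "dbt_run".
dependencies = {
    "check_files": set(),
    "load_files": {"check_files"},
    "dbt_run": {"load_files"},
    "post_refresh_export": {"dbt_run"},
}


def run_order(deps):
    """Return one valid execution order for the given dependency mapping."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    print(run_order(dependencies))
```

In a real deployment the scheduler would own this ordering; the sketch only shows why the dbt container needs no knowledge of the tasks around it.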

I find that once it is set up, the pattern is extremely easy for small teams to follow, and the single view of all the running pipelines is a great benefit. Airflow also handles the logic around last successful runs and the like, which would need to be implemented manually if using simple cron jobs.
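
For a sense of what "implemented manually" means, below is a minimal sketch of the last-successful-run bookkeeping a cron-driven script would have to carry itself. This is not how Airflow does it internally; the state file format and the names `read_last_success` and `run_job` are assumptions made up for the example.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def read_last_success(state_file):
    """Return the last recorded successful run time, or None if there is none."""
    path = Path(state_file)
    if not path.exists():
        return None
    return datetime.fromisoformat(json.loads(path.read_text())["last_success"])


def run_job(state_file, job, now=None):
    """Run job(since, until) and record `until` only if the job did not raise."""
    until = now or datetime.now(timezone.utc)
    since = read_last_success(state_file)
    job(since, until)  # an exception here leaves the state file untouched
    Path(state_file).write_text(json.dumps({"last_success": until.isoformat()}))
    return since, until
```

A failed run leaves the old timestamp in place, so the next invocation picks up the same window again. Even this small version ignores retries, concurrent runs and alerting, which is the part a scheduler gives you for free.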