| I echo the other comments. Running and managing Airflow for anything beyond simple jobs is complicated. But if you are only running it for simple jobs, you might not need Airflow at all. One data center company that I know of uses Airflow at scale with Docker and k8s. They have a huge devops team just to manage the orchestrator, and that team has to fine-tune it to run smoothly and efficiently. Similar to what Shopify has noted here, they have built on top of and extended Airflow to take care of pain points like point 4. For companies like this it makes sense to run Airflow. Another issue I see with companies/engineers who adopt Airflow is that they use it as a substitute for a script rather than as an orchestrator. For example, say you want to download files from an API, upload them to S3, load them into your warehouse (say Snowflake) and do some transformations to get your final table. Instead of writing separate scripts for each step of fetch/upload/ingest/transform and calling each one from the DAG, they end up writing everything as tasks inside the DAG. A huge disadvantage is that there is a lot of code duplication. If you had each script as a CLI, all your DAG/task would have to do is call the script with the respective args. I agree that Airflow comes with a lot of convenience wrappers for creating tasks for many things, but I feel this results in losing flexibility. It also ties their workflow to Airflow, so any change they need means modifying their Airflow code directly. If you want to change how/what you upload to S3, you end up writing/modifying Python functions in the respective DAGs' code. This removes the flexibility to modify or swap out any component of the workflow, or even to change the orchestrator from Airflow to something else. Additionally, different teams might write workflows in different ways, so standardizing practice is really hard.
This in turn results in pouring more investment into maintenance and into hiring "Airflow data engineers". Companies fall into steep tech debt. Prefect/Dagster are the new orchestrators in town. I have yet to try them out, but I've heard mixed reviews about them. EDIT: Forgot about upgrades. A lot of upgrades come with breaking changes, especially the recent move from 1.x to 2.x. You end up spending a lot of time just trying to debug what went wrong. Just installing and running it is a pain. |
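
To make the script-as-CLI point above a bit more concrete, here is a rough sketch of what I mean. Everything in it is hypothetical: the file name `pipeline.py`, the `fetch`/`upload` steps and their placeholder bodies, and the `task_command` helper are made up for illustration. The idea is only that the pipeline logic lives in a plain CLI, and the orchestrator task receives nothing more than a command string (in Airflow that string could be passed as the `bash_command` of a `BashOperator`, but any orchestrator that can run a shell command would do).

```python
import argparse
import shlex


# Hypothetical pipeline steps. The bodies are placeholders that only
# describe what a real implementation would do.
def fetch(source_url: str, out_path: str) -> str:
    return f"fetched {source_url} -> {out_path}"


def upload(in_path: str, bucket: str) -> str:
    return f"uploaded {in_path} -> s3://{bucket}"


def build_parser() -> argparse.ArgumentParser:
    """One subcommand per pipeline step."""
    parser = argparse.ArgumentParser(prog="pipeline")
    sub = parser.add_subparsers(dest="step", required=True)

    p_fetch = sub.add_parser("fetch")
    p_fetch.add_argument("--source-url", required=True)
    p_fetch.add_argument("--out-path", required=True)
    p_fetch.set_defaults(func=lambda a: fetch(a.source_url, a.out_path))

    p_upload = sub.add_parser("upload")
    p_upload.add_argument("--in-path", required=True)
    p_upload.add_argument("--bucket", required=True)
    p_upload.set_defaults(func=lambda a: upload(a.in_path, a.bucket))

    return parser


def main(argv=None) -> str:
    """CLI entry point: parse args, run the chosen step, print the result."""
    args = build_parser().parse_args(argv)
    result = args.func(args)
    print(result)
    return result


def task_command(step: str, **options: str) -> str:
    """Build the shell command an orchestrator task would run.

    The orchestrator only ever sees this string, so the steps can be
    changed (or the orchestrator replaced) without touching DAG logic.
    """
    parts = ["python", "pipeline.py", step]
    for name, value in options.items():
        parts += [f"--{name.replace('_', '-')}", value]
    return shlex.join(parts)


# In a real script you would end with:
#     if __name__ == "__main__":
#         main()
```

With something like this, `task_command("upload", in_path="/tmp/data.json", bucket="my-bucket")` gives the string the DAG task runs, and the same script can be run by hand, from cron, or from a different orchestrator without changes.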