by mdaniel 451 days ago
> What issues do you have with Airflow?

Their operational perspective is catastrophic: how does one view the logs for a DAG through the UI[1]? Why can't it store the Python in the database they have attached to their deployment, versus making me jump through 80,000 hoops to put the files in the right magic directory on the disk of every worker[2]?

1: no, not <https://airflow.apache.org/docs/apache-airflow/stable/ui.htm...> I mean the log, you know, like in the old days of $(tail -f /var/log/the.thing). I'm open to the answer hiding somewhere in this gobbledygook <https://airflow.apache.org/docs/apache-airflow/stable/admini...> but who is the target audience for having such a fancy UI and omitting log viewing from it, doubly so if there's some alleged http just for viewing logs

2: https://airflow.apache.org/docs/apache-airflow/stable/core-c... and double-plus-good anytime Python software mentions PYTHONPATH -- that's how you know you're in for a hot good time https://airflow.apache.org/docs/apache-airflow/stable/admini...

1 comment

We deploy on K8s in OpenStack from a scheduled GitHub Actions pipeline which aggregates DAGs into a new container build based on hashes of hashes. This works well with almost no intervention.
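
A minimal sketch of what a "hash of hashes" rebuild check could look like, using only the standard library. The function names, the `*.py` pattern, and the idea of comparing against a previously stored digest are all assumptions for illustration; the pipeline described above may work quite differently.

```python
import hashlib
from pathlib import Path
from typing import Optional


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of one file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def tree_digest(root: Path, pattern: str = "*.py") -> str:
    """Hash the per-file hashes, walking paths in sorted order so the
    result is stable across runs and machines."""
    outer = hashlib.sha256()
    for path in sorted(root.rglob(pattern)):
        rel = path.relative_to(root).as_posix()
        # Include the relative path so renames also change the digest.
        outer.update(f"{rel}:{file_digest(path)}\n".encode())
    return outer.hexdigest()


def needs_rebuild(root: Path, previous_digest: Optional[str]) -> bool:
    """True when the DAG tree differs from the last recorded digest
    (hypothetical helper; where the old digest is stored is up to you)."""
    return tree_digest(root) != previous_digest
```

A scheduled job could store the digest as an image tag or label and skip the container build whenever `needs_rebuild` returns False.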

WRT your 1 above: any DAG output to stdout/stderr is available via the logs tab, reached from the graph view of the individual tasks. Almost all our DAGs rely on the PythonOperator though, so I'm not sure whether that standardises this for us and your experience is muddied by more complexity than we currently have.

WRT 2: we generate an uber requirements.txt by running pyreqs from the pipeline and install everything in the container automatically. Again, no issues currently, although we do need to manually add the installation of test libraries to the pipeline job, as for some reason auto-discovery is flakier for unit-test frameworks.
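
To illustrate the general idea of import-based requirement discovery, here is a rough sketch using the standard-library `ast` module. It is not how the tool mentioned above is implemented; the function names are made up for this example.

```python
import ast
import sys
from typing import Set


def top_level_imports(source: str) -> Set[str]:
    """Collect the top-level package names imported by a piece of source."""
    names: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports (level > 0); they refer to local code.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


def third_party_candidates(source: str) -> Set[str]:
    """Drop standard-library modules where the interpreter can list them
    (sys.stdlib_module_names exists on Python 3.10+)."""
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    return top_level_imports(source) - set(stdlib)
```

A scanner like this only sees explicit import statements, so anything loaded as a plugin or only invoked from the command line would not show up. Import names also do not always match package names (`yaml` is installed as `PyYAML`), so a mapping step is still needed before writing requirements.txt.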