by vitorbaptistaa 456 days ago
My experience so far:

* Luigi -- extensive usage (4y+)

* Makefiles -- (15y+)

* GitHub Actions -- (4y+)

* Airflow -- little usage (<6 months)

* Dagster -- very little, just trying it out

* Prefect -- just followed tutorial

Although it lacks a lot of the monitoring and the advanced web UI that other platforms have (or maybe because of that), Luigi is the simplest to reason about IMHO.

For a new project that will require complex orchestrations, I'd probably go with Dagster or Prefect nowadays. Dagster seems more complex and more powerful with its data lineage functionality, but I have very little experience with either tool.

If it's a simple project, a mix of Makefiles + GH Actions can work well.


Is there anything even more lightweight, where you don't have to write your code any differently? For instance, say I have 10 jobs that don't depend on each other, all of them pretty small.

Dagster and even Luigi feel like overkill, but I'd still like to plug those jobs into a unified interface where I can view previous runs, mainly logs and exit codes. Being able to do some light job configuration or add retries would be nice but is not required. For the moment I just use a logging handler that writes to a database table, and that's fine.
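For what it's worth, the "logging handler plus a database table" approach can be stretched a little to cover exit codes and retries without touching the jobs themselves. Below is a minimal sketch using only the Python standard library. Everything in it is an assumption for illustration: the `runs` and `logs` tables, the `SQLiteHandler` class, and the `run_job` and `setup` helpers are hypothetical names, and SQLite stands in for whatever database is actually in use.

```python
import logging
import sqlite3
import subprocess
import sys
import time

# Hypothetical schema: one row per job attempt, one row per log record.
SCHEMA = """
CREATE TABLE IF NOT EXISTS runs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    job TEXT NOT NULL,
    attempt INTEGER NOT NULL,
    started_at REAL NOT NULL,
    exit_code INTEGER,
    output TEXT
);
CREATE TABLE IF NOT EXISTS logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    created REAL NOT NULL,
    level TEXT NOT NULL,
    logger TEXT NOT NULL,
    message TEXT NOT NULL
);
"""


class SQLiteHandler(logging.Handler):
    """Logging handler that appends each record to the `logs` table."""

    def __init__(self, conn):
        super().__init__()
        self.conn = conn

    def emit(self, record):
        try:
            self.conn.execute(
                "INSERT INTO logs (created, level, logger, message) "
                "VALUES (?, ?, ?, ?)",
                (record.created, record.levelname, record.name,
                 record.getMessage()),
            )
            self.conn.commit()
        except Exception:
            self.handleError(record)


def setup(path=":memory:"):
    """Open the database, create the tables, and attach the handler once."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    logger = logging.getLogger("jobs")
    logger.setLevel(logging.INFO)
    logger.addHandler(SQLiteHandler(conn))
    return conn


def run_job(conn, name, argv, retries=0):
    """Run argv as a subprocess, record every attempt, return the last exit code."""
    log = logging.getLogger("jobs")
    for attempt in range(1, retries + 2):  # first try plus `retries` retries
        started = time.time()
        proc = subprocess.run(argv, capture_output=True, text=True)
        conn.execute(
            "INSERT INTO runs (job, attempt, started_at, exit_code, output) "
            "VALUES (?, ?, ?, ?, ?)",
            (name, attempt, started, proc.returncode,
             proc.stdout + proc.stderr),
        )
        conn.commit()
        log.info("job=%s attempt=%d exit_code=%d",
                 name, attempt, proc.returncode)
        if proc.returncode == 0:
            break  # success, no further retries needed
    return proc.returncode
```

Usage would look something like `conn = setup("runs.db")` followed by `run_job(conn, "nightly-export", [sys.executable, "export.py"], retries=2)`, where `export.py` is a placeholder for an existing script. Viewing previous runs is then a plain `SELECT` over `runs`. This gives no scheduling or UI, so it only fits the "10 small independent jobs" case described above.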

I think Airflow 2 added a decorator mode (the TaskFlow API, with `@task` and `@dag`) that you can apply directly to your functions.

Honestly, just use Airflow. It has its issues, but it sucks in well-known and predictable ways.

One of the goals of Prefect's SDK is to be minimally invasive from a code standpoint (in the simplest case you only need two lines to convert a script to a `flow`). Our deployment model also makes infrastructure job config a first-class citizen, so you might have a good time trying it out. (Disclosure: I work at Prefect.)

Love Prefect! But for workflows involving concurrency, Prefect code needs to get somewhat invasive.

Prefect relies on prefect.task()-wrapped methods as the lowest granularity of concurrency in a program, and requires you to use the (somewhat immature) prefect task APIs to implement that concurrency.

More on this complaint here: https://austinweisgrau.github.io/migrating-to-prefect-part-3...

This is an excellent write-up, thank you for sharing! Yeah, our concurrency API needs an upgrade. Coincidentally, this is going to be a theme of the next sprint or two, so I hope I can report some improvements back soon.