| I can offer one data point. This is from purely startup-based experience (seed to Series A). A while ago I moved from microservices to a monolith because the microservices were too complicated and had a lot of duplicated code. Without microservices there's less need for a message queue. For async stuff, I used RabbitMQ for one project, but it just felt...old and over-architected? And a lot of the tooling around it (Celery) just wasn't as good as the modern stuff built around Redis (BullMQ). For multi-step, DAG-style processes, I prefer to KISS and just do that all in a single, large job if I can, or break it into a small number of jobs. If I REALLY needed a DAG thing, there are tools out there that are specifically built for that (Airflow). But I hear it's difficult to debug issues in them, so I would avoid them at almost any cost. I have run into scaling issues with Redis, because its multi-node architectures are just ridiculously over-complicated, and so I stick with single-node. But sharding by hand is fine for me, and works well. |
For example, say you need some ETL to happen every day. Instead of having your pipeline logic inside an Airflow task, you put your logic in a library, where you can test and establish boundaries for this behavior in isolation, and compose this logic portably into any system that can accept your library code. When you need to orchestrate, you just call this function inside an Airflow task.
This has a few benefits. You now decouple, to a significant extent, your logic and state transfer from your orchestration. That means if you want to debug your DAG, you don’t need to do it in Airflow. You can take the same series of function calls and run them, for example, sequentially in a notebook and you would achieve the same effect. This also can reveal just how little logic you really need in orchestration.
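To make that concrete, here is a rough sketch of what I mean. Every name in it (`extract`, `transform`, `load`, `run_daily_etl`) is made up for illustration, and the "source" and "sink" are plain in-memory stand-ins rather than a real database. The point is only that the whole pipeline is ordinary library code you can run top to bottom without an orchestrator.

```python
from datetime import date

# Hypothetical library code; none of these names come from a real package.

def extract(day: date) -> list[dict]:
    """Stand-in for reading one day's raw records from a source system."""
    return [
        {"day": day.isoformat(), "user": "a", "amount": "10.50"},
        {"day": day.isoformat(), "user": "b", "amount": "4.25"},
        {"day": day.isoformat(), "user": "a", "amount": "1.00"},
    ]

def transform(rows: list[dict]) -> dict[str, float]:
    """Pure logic: total amount per user. Easy to unit test in isolation."""
    totals: dict[str, float] = {}
    for row in rows:
        totals[row["user"]] = totals.get(row["user"], 0.0) + float(row["amount"])
    return totals

def load(totals: dict[str, float], sink: dict) -> int:
    """Stand-in for writing results; the sink here is just a dict."""
    sink.update(totals)
    return len(totals)

def run_daily_etl(day: date, sink: dict) -> int:
    """The whole pipeline as one plain function call."""
    return load(transform(extract(day)), sink)

# Notebook or test session: run the steps sequentially, no orchestrator involved.
sink: dict = {}
written = run_daily_etl(date(2024, 1, 1), sink)
print(written, sink)
```

In the orchestrator, the task body would then shrink to roughly one line, a call to `run_daily_etl`, which is about all the logic that has to live there.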
There are some other tricks to making this work really well, such as reducing dependency injection to primitives only where possible, and focusing on decoupling logic from configuration. Some of this is pretty standard, but I’ve seen teams not have a strong philosophy on this and then struggle with maintaining clean orchestration interfaces.
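One way to read "primitives only" and "logic separate from configuration" is sketched below. This is my interpretation, not a standard recipe, and the function names, the config keys (`bucket`, `threshold`), and the path layout are all hypothetical. The logic functions accept only strings and numbers, and the configuration is parsed once at the edge and unpacked into those primitives by whatever is doing the orchestrating.

```python
import json

def build_export_path(bucket: str, dataset: str, day: str) -> str:
    """Logic takes only primitives, so any caller can supply them."""
    return f"{bucket}/{dataset}/dt={day}/part.json"

def filter_large(amounts: list[float], threshold: float) -> list[float]:
    """Keep amounts at or above the threshold; no config object in sight."""
    return [a for a in amounts if a >= threshold]

def load_config(raw: str) -> dict:
    """Configuration is parsed once, at the edge, into plain values."""
    cfg = json.loads(raw)
    return {"bucket": str(cfg["bucket"]), "threshold": float(cfg["threshold"])}

# The orchestration layer (a scheduler task, a cron script, a notebook cell)
# is the only place that knows about the config format.
cfg = load_config('{"bucket": "example-bucket", "threshold": 5}')
path = build_export_path(cfg["bucket"], "orders", "2024-01-01")
kept = filter_large([1.0, 5.0, 9.5], cfg["threshold"])
print(path, kept)
```

Because nothing in `build_export_path` or `filter_large` knows where its arguments came from, swapping the config source or the orchestrator does not touch the logic.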