by oldelpaso66
1067 days ago
I don't get why you claim that something like Airflow doesn't bridge the gap well with researchers who write code. I've worked with WDL extensively, and I still think that Airflow is a superior tool. The second I need any sort of branching logic in my pipeline, the ways of solving it feel like working against the tool, not with it.

Some features that bridge the gap:
1. Command-line tools are often used in steps of a bioinformatics pipeline. The workflow managers expect this and make them easier to use (e.g. https://github.com/snakemake/snakemake-wrappers).
2. Constructing the DAG explicitly from file inputs and outputs is built in, which seems easier for researchers to understand than constructing DAGs from functions.
3. Built-in support for executing on a cluster through something like SLURM.
4. Running "hacky" shell or R scripts in steps of the pipeline is well-supported. As an aside, it's surprising how often a mis-implemented subprocess.run() or os.system() call causes issues.
5. There's a strong community building open-source bioinformatics pipelines for each workflow manager (e.g. nf-core, WARP, Snakemake workflows).
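To make point 2 concrete, here is a rough sketch of how a DAG can fall out of declared input and output files. It isn't how any particular workflow manager is implemented; the `RULES` dictionary, the step names, and the file names are all made up for illustration, and it assumes Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline steps (names and files are illustrative only).
# Each step only declares which files it reads and which it writes.
RULES = {
    "annotate": {"inputs": ["sample.vcf"], "outputs": ["sample.annotated.vcf"]},
    "align": {"inputs": ["sample.fastq"], "outputs": ["sample.bam"]},
    "call_variants": {"inputs": ["sample.bam"], "outputs": ["sample.vcf"]},
}


def build_dag(rules):
    """Map each step to the set of steps that produce its input files."""
    producer = {
        out: name for name, rule in rules.items() for out in rule["outputs"]
    }
    return {
        name: {producer[f] for f in rule["inputs"] if f in producer}
        for name, rule in rules.items()
    }


def execution_order(rules):
    """Return the steps in an order that respects the file dependencies."""
    return list(TopologicalSorter(build_dag(rules)).static_order())


if __name__ == "__main__":
    print(execution_order(RULES))
```

Nobody wires the steps together by hand here: the order comes purely from which file each step needs, which I suspect is the mental model most researchers already have.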
Airflow, like the other less domain-specific workflow managers, is arguably better for people who have a stronger software engineering background. For someone who moved from wet lab to dry lab and is learning to code on the side, I think the bioinformatics workflow managers lower the barrier to entry.
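On the subprocess.run() aside in point 4: a minimal sketch of the kind of wrapper that avoids the usual problems (shell quoting and silently ignored failures). The `run_step` helper is a hypothetical name, not something from Airflow or any workflow manager.

```python
import subprocess
import sys


def run_step(args):
    """Run one pipeline step and return its stdout; raise if it fails."""
    result = subprocess.run(
        args,                 # a list of arguments, so no shell quoting is involved
        check=True,           # raise CalledProcessError on a non-zero exit code
        capture_output=True,  # collect stdout and stderr instead of losing them
        text=True,            # decode output as str rather than bytes
    )
    return result.stdout


if __name__ == "__main__":
    # Stand-in for a real command-line tool: run the current Python interpreter.
    print(run_step([sys.executable, "-c", "print('ok')"]))
```

The common mistake is leaving out `check=True` (or using os.system() and ignoring the return value), so a failed step looks like a success and the pipeline carries on with missing or truncated files.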