Hacker News
by nerdponx 3446 days ago
This post is making me think it would be a great educational exercise to construct equivalent data processing flows in some popular tools: Make, Airflow, Luigi, Snakemake, Rake, others?

Indeed, and not only for education, but also as a way to evaluate tools for various use cases, I think. I have been thinking the same, and have looked hard for anything like a set of evaluation workflows that incorporates various specific "motifs", if you like (such as nested parameter sweeps).
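
To make the "nested parameter sweep" motif concrete, here is a rough Python sketch of what I mean: an outer parameter produces one intermediate file per value, and an inner parameter is then swept over each of those intermediates. The parameter names (`k`, `threshold`) and the file name patterns are made up for illustration and do not come from any particular tool or from the workflow linked in this thread.

```python
def nested_sweep(ks, thresholds):
    """Enumerate the tasks of a two-level (nested) parameter sweep.

    All names here are hypothetical: `k` is the outer parameter,
    `threshold` the inner one.
    """
    tasks = []
    for k in ks:
        # One upstream artifact per outer value, shared by the inner sweep.
        upstream = f"index_k{k}.dat"
        for t in thresholds:
            tasks.append({
                "input": upstream,
                "output": f"calls_k{k}_t{t}.txt",
                "params": {"k": k, "threshold": t},
            })
    return tasks


if __name__ == "__main__":
    for task in nested_sweep([1, 2], [0.1, 0.5, 0.9]):
        print(task["input"], "->", task["output"])
```

An evaluation workflow would then ask how naturally each tool expresses this: whether the shared upstream file is computed only once per `k`, and whether the output names can be derived from the parameters.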

Unfortunately I haven't found anything, so for our use cases in bioinformatics, I basically took an example workflow from a course on next-gen sequencing analysis as a starting point:

https://github.com/NBISweden/workflow-tools-evaluation/tree/...

I have only partly implemented it in Common Workflow Language [1] and SciPipe [2] so far ... the implementation turned out to take a tremendous amount of work :P

I would be very interested to hear if anyone has found or created a more general set of example workflows like this.

[1] http://commonwl.org

[2] https://github.com/scipipe/scipipe

Yes, thank you! I'll see if maybe I can throw something similar together for a social science data project, like a Titanic dataset run-through.