I have looked a bit at code examples of Airflow, but was worried that it seems to share a problem with a lot of other pipeline tools: in the main workflow specification, dependencies are specified only between tasks, not between the individual inputs and outputs of each task (that is, between tasks rather than data).
This means that this info needs to be implemented "manually" in some less declarative manner somewhere else, breaking the declarative-ness of the workflow specification.
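To make the distinction concrete, here is a rough sketch in plain Python. It is not Airflow's API (or any real library's); the task and port names (`split`, `fit`, `evaluate`, `train_data`, and so on) and the `task_order` helper are all made up for illustration.

```python
# Toy illustration only: none of these names come from Airflow or any real tool.

# Style 1: dependencies between tasks only.
# The graph says "fit" runs after "split", but not WHICH output of "split"
# it reads. That wiring has to be hard-coded inside each task's body.
task_deps = {
    "fit": ["split"],
    "evaluate": ["fit", "split"],
}

# Style 2: dependencies between named outputs and inputs.
# Each entry maps (consumer task, input name) -> (producer task, output name).
data_deps = {
    ("fit", "train_data"): ("split", "train"),
    ("evaluate", "test_data"): ("split", "test"),
    ("evaluate", "model"): ("fit", "model"),
}


def task_order(data_deps):
    """Derive the task-level graph from the data-level one."""
    deps = {}
    for (consumer, _in_name), (producer, _out_name) in data_deps.items():
        deps.setdefault(consumer, set()).add(producer)
    return {task: sorted(upstream) for task, upstream in deps.items()}


# The task graph falls out of the data graph, but not the other way around.
derived = task_order(data_deps)
```

The point of the sketch is that `derived` carries the same information as `task_deps`, while the mapping in `data_deps` cannot be reconstructed from `task_deps` alone.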
We wrote a package to go with our Airflow installation that borrows some of the data flow concepts we liked from Make/Drake/Luigi (as opposed to the exclusively task-dependency flow in Airflow that you mention). You may be interested: github.com/industrydive/fileflow
That's nice! Didn't know Airflow did in-memory passing (as I now understand it does?), so I can see that this must be needed for larger data items, right?
Does it also make it easier to route multiple individual outputs to separate downstream components, etc.?
I posted about this task-versus-data dependency problem some time ago, mentioning Airflow specifically: http://bionics.it/posts/workflows-dataflow-not-task-deps