by mariusae 2968 days ago
Reflow [1] is also well suited to ETL workloads. It takes a different tack: it presents a DSL with data-flow semantics and first-class integration with Docker. The result is that you don't write graphs; instead, you just write programs that, due to their semantics, can be automatically parallelized and distributed widely. All intermediate evaluations are memoized, and programs are evaluated in a fully incremental fashion.
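To make "memoized" and "incremental" concrete, here is a rough Python sketch of the general idea. To be clear, this is not Reflow syntax and not how Reflow is implemented; the names `Memo`, `extract`, `transform`, and `pipeline` are made up for illustration. It assumes each step is a pure function of its arguments, so a result can be cached under a digest of the step name and its inputs:

```python
import hashlib
import json


class Memo:
    """Hypothetical cache keyed by a digest of (step name, arguments)."""

    def __init__(self):
        self.cache = {}
        self.evaluations = 0  # counts how many steps actually ran

    def call(self, fn, *args):
        # Assumes fn is pure and its arguments are JSON-serializable.
        payload = json.dumps([fn.__name__, args], sort_keys=True)
        key = hashlib.sha256(payload.encode()).hexdigest()
        if key not in self.cache:
            self.evaluations += 1
            self.cache[key] = fn(*args)
        return self.cache[key]


def extract(record):
    # Illustrative "extract" step: normalize a raw record.
    return record.strip().lower()


def transform(word):
    # Illustrative "transform" step: derive a value from a clean record.
    return len(word)


def pipeline(memo, records):
    # The program is written as ordinary code; no explicit graph.
    cleaned = [memo.call(extract, r) for r in records]
    return sum(memo.call(transform, w) for w in cleaned)


memo = Memo()
first = pipeline(memo, [" Foo ", "Bar"])
runs_after_first = memo.evaluations

# Re-run with one extra record: only the new record's steps execute.
second = pipeline(memo, [" Foo ", "Bar", "Quux "])
runs_after_second = memo.evaluations

print(first, runs_after_first)    # 6 4
print(second, runs_after_second)  # 10 6
```

The second run reuses the four cached results and evaluates only the two steps for the new record. Since the per-record steps don't depend on each other, a runtime would also be free to run them in parallel, which this sketch does not attempt.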

[1] https://github.com/grailbio/reflow