by mumblemumble 2332 days ago
It's helpful in other domains, too.

For example, in ETL pipelines, I would greatly prefer to have an entire DAG go down quickly and noisily than to risk having it generate incorrect data. It's the difference between a crappy morning, and a crappy day or even a crappy week.

2 comments

Just curious, since a lot of my work is on ETL: what are your favorite DAG libraries/approaches? Some of my ETL workflows can run in parallel with each other because there aren't data dependencies between them, and others should definitely crash loudly.
We use a home-grown solution, and try to keep it pretty minimal. For example, we can run all our pipeline steps serially and still get all the work done plenty fast, so, for now, we're keeping that parallelism can of worms firmly shut.
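To make the "serial and noisy" idea concrete, here is a minimal Python sketch of that kind of runner. It is not the home-grown solution described above, just an illustration; the names run_serially, PipelineError, extract, transform, and load are all hypothetical, and the in-memory list stands in for a real data sink.

```python
from typing import Callable, Iterable

Rows = list[dict]
Step = Callable[[Rows], Rows]

# Hypothetical in-memory sink standing in for a warehouse table.
loaded: Rows = []


class PipelineError(RuntimeError):
    """Raised when any step fails; the message names the failing step."""


def run_serially(steps: Iterable[tuple[str, Step]], rows: Rows) -> Rows:
    """Run steps one after another and abort the whole run on the first error."""
    for name, step in steps:
        try:
            rows = step(rows)
        except Exception as exc:
            # Stop here: no later step (including the load) ever runs.
            raise PipelineError(f"step {name!r} failed: {exc}") from exc
    return rows


def extract(_: Rows) -> Rows:
    # Illustrative source data; the second row is deliberately malformed.
    return [{"id": 1, "amount": "10.5"}, {"id": 2, "amount": "oops"}]


def transform(rows: Rows) -> Rows:
    # float() raises ValueError on bad input instead of silently writing a null.
    return [{**row, "amount": float(row["amount"])} for row in rows]


def load(rows: Rows) -> Rows:
    loaded.extend(rows)
    return rows


if __name__ == "__main__":
    pipeline = [("extract", extract), ("transform", transform), ("load", load)]
    try:
        run_serially(pipeline, [])
    except PipelineError as err:
        print(err)
    print("rows loaded:", len(loaded))
```

The point of the design is that the bad row stops the run before load is reached, so nothing incorrect lands downstream; the cost is that one bad row blocks the good ones too, which is exactly the trade being argued for here.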
Our ETL approach is all based on Elixir (Erlang), and it has some drawbacks, but ease of debugging is not one of them. No libraries, just code.
> a crappy day or even a crappy week

Or a crappy discovery at the end of the quarter...

Or crappy decisions.