by thalesmello
2964 days ago
As the one who implemented Airflow at my company, I understand how overwhelming it can be, with its DAGs, Operators, Hooks and other terminology. This looks like a good enough mid-term alternative. However, I have a few questions (which I couldn't easily find on the homepage, sorry if I missed something):
- Do you have a way of persisting connection information? I saw an example of how to create a connection, but it isn't clear whether that piece of code has to be loaded every time you execute the ETL.
- How easy is it to implement new computation engines?
- Are there plans to create a command line to make it easier to execute operations?
|
Connection information is configured in code through mara-db [1]; see [2] for an example.
It's very easy to run other workloads, either by directly invoking Python functions from tasks or by writing your own commands (operators) [3].
There is a command line. It's the interface for running from external schedulers (Jenkins, cron); see [4] and [5].
[1] https://github.com/mara/mara-db
[2] https://github.com/mara/mara-example-project/blob/master/app...
[3] https://github.com/mara/data-integration/blob/master/data_in...
[4] https://github.com/mara/data-integration/raw/master/docs/exa...
[5] https://github.com/mara/data-integration/raw/master/docs/exa...
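To make the three answers above a bit more concrete, here is a minimal stand-alone sketch of the general pattern: connections defined in code, a task that just calls a Python function, and a small command-line entry point that cron or Jenkins could invoke. It uses only the standard library. Every name in it (`DbConfig`, `databases`, `Task`, `load_customers`, `main`, the `dwh` alias) is hypothetical and chosen for illustration; it is not the mara API, so see the linked repositories for the real interfaces.

```python
"""Illustrative sketch only; none of these names come from mara."""
import argparse
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class DbConfig:
    """Hypothetical connection settings kept in version-controlled code."""
    host: str
    database: str
    user: str
    port: int = 5432


def databases() -> Dict[str, DbConfig]:
    """Return all connections by alias.

    The function is evaluated on every run, so the configuration is
    loaded with the code instead of being persisted somewhere else.
    """
    return {"dwh": DbConfig(host="localhost", database="dwh", user="etl")}


@dataclass
class Task:
    """A named pipeline step that simply calls a Python function."""
    name: str
    run: Callable[[], bool]


def load_customers() -> bool:
    """Placeholder workload: look up a connection and pretend to load data."""
    db = databases()["dwh"]
    print(f"loading customers into {db.database} on {db.host}:{db.port}")
    return True


PIPELINE: List[Task] = [Task("load_customers", load_customers)]


def run_pipeline(tasks: List[Task]) -> bool:
    """Run tasks in order and stop at the first failure."""
    for task in tasks:
        if not task.run():
            print(f"task {task.name} failed")
            return False
    return True


def main(argv: Optional[List[str]] = None) -> int:
    """Command-line entry point; the return value is the process exit code."""
    parser = argparse.ArgumentParser(description="Run the sketch pipeline")
    parser.add_argument("--task", help="run only the task with this name")
    args = parser.parse_args(argv)
    tasks = [t for t in PIPELINE if args.task in (None, t.name)]
    return 0 if tasks and run_pipeline(tasks) else 1

# In a real script, finish with: raise SystemExit(main())
```

With that last line added, an external scheduler only needs to run the script and check its exit code: 0 when all selected tasks succeed, 1 when a task fails or the requested task name does not exist.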