by vekker 724 days ago
Does this also manage the infrastructure side of ETL? Usually some parts of a complex ETL process need a lot more processing power, so they are run on different machines. From a quick glance, this looks like a WYSIWYG ETL tool for running ETL jobs on one machine?

Thanks for your question. Amphi generates Python code that uses pandas, and that code can scale on a single machine or even across multiple machines using Modin, but the process is manual for now. Future plans include deploying pipelines on Spark clusters and other services such as Snowflake.
What about Dask?
Using Modin, deploying the pandas code on Dask should be possible: https://modin.readthedocs.io/en/stable/development/using_pan...
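
To make the manual step concrete, here is a minimal sketch of how pandas-style code might be pointed at Modin with the Dask engine. It is not code generated by Amphi: `pick_dataframe_module`, `total_by_region`, and the sample data are hypothetical names invented for illustration, and the Modin branch assumes Modin is installed with its Dask extra (for example `pip install "modin[dask]"`).

```python
def pick_dataframe_module(use_modin: bool):
    """Return a pandas-compatible module (hypothetical helper for this sketch)."""
    if use_modin:
        # Assumes Modin with the Dask extra is installed.
        import modin.config as modin_cfg
        modin_cfg.Engine.put("dask")  # ask Modin to execute on Dask
        import modin.pandas as mpd
        return mpd
    import pandas
    return pandas


def total_by_region(pd):
    """Toy pipeline step written against the pandas API only."""
    df = pd.DataFrame(
        {"region": ["eu", "us", "eu", "us"], "amount": [10, 20, 30, 40]}
    )
    totals = df.groupby("region")["amount"].sum()
    # Convert to plain Python types so the result looks the same on either backend.
    return {region: int(total) for region, total in totals.items()}


if __name__ == "__main__":
    # Flip use_modin to True once Modin and Dask are available.
    print(total_by_region(pick_dataframe_module(use_modin=False)))
```

The pipeline function itself does not change; only the module passed in does, which is the sense in which Modin acts as a drop-in replacement for pandas.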