by polskibus 2534 days ago
Apart from being cloud-focused, what's the difference between dataform and a classic ETL engine like SQL Server Integration Services?

Great question. Beyond being on the cloud and everything that goes with that, I'd say that there are two fundamental differences with a product like SSIS.

1. Dataform focuses on transformations happening _inside_ the cloud data warehouse. We don't move data between systems. More and more businesses are moving from traditional ETL processes to ELT and centralising their raw data in their warehouse. Dataform helps businesses manage the T in ELT.

2. Dataform is built with software engineering best practices in mind. In Dataform, all your transformations are written in code (mostly SQL) instead of being configured in a GUI as in SSIS. The code can be version controlled and edited like any other source code, and it scales better to the large number of transformations teams have to deal with.
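
To make the ELT point concrete, here is a minimal sketch of the pattern in Python. It is not Dataform's API: an in-memory SQLite database stands in for the cloud warehouse, and the table names (`raw_orders`, `customer_totals`) and the `run_elt` helper are made up for illustration. The raw rows are loaded untouched, and the transformation is plain SQL executed where the data already lives.

```python
import sqlite3


def run_elt(rows):
    """Load raw rows, then transform them with SQL inside the same database.

    Hypothetical helper: SQLite is only a stand-in for a cloud warehouse.
    """
    conn = sqlite3.connect(":memory:")

    # E + L: land the raw data as-is, with no reshaping on the way in.
    conn.execute("CREATE TABLE raw_orders (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?)", rows)

    # T: the transformation is a SQL statement run inside the database,
    # so no data leaves the system to be transformed elsewhere.
    conn.execute(
        """
        CREATE TABLE customer_totals AS
        SELECT customer, SUM(amount) AS total
        FROM raw_orders
        GROUP BY customer
        """
    )

    result = conn.execute(
        "SELECT customer, total FROM customer_totals ORDER BY customer"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(run_elt([("alice", 10.0), ("bob", 5.0), ("alice", 2.5)]))
```

Because the transformation is just a SQL string in a source file, it can be committed, reviewed and diffed like any other code, which is the contrast with a GUI-defined pipeline.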

Is there a tutorial to run it on-prem? For example for development or testing purposes?

Which data warehouses are currently supported, and which are on the roadmap? Is ClickHouse somewhere in your plans?

On-prem: Right now our IDE is only available as SaaS, although we will be looking at this in the near future. You can develop and test projects with the CLI and deploy them yourself, but there are no tutorials for setting this up beyond the basics yet: https://docs.dataform.co/guides/command-line-interface/

Warehouse support: Athena/Presto and Azure are top of mind. I've not come across ClickHouse before but I'll definitely add it to our tracker!