by cm
3610 days ago
I'm an engineer at Stitch. Our approach to transformation is to do just enough to move data from one system to another without losing precision or fidelity. So we transform datatypes and structures into forms more appropriate for the target system, but we don't have any transformation operators like aggregation or windowing. We have found that this approach works well for our users, who prefer to get the rawest possible data, and for the systems we target, like Redshift, which are themselves powerful transformation engines. This gives the user unlimited flexibility for defining transformations, and a full audit trail for understanding how their data has changed. We are always evolving, though, so if there's a use case that you think requires this approach, I would be eager to hear more about it.
I've done a lot of ETL, mostly for healthcare.
Yes, engineers should be doing ETL work. Any "workflow engine" that promises patch cord or visual programming is hooey. At the end of the day, someone somewhere is gonna be writing some code. And it's not the "business analyst" or "subject area expert". No, it's a dev. And all that clever framework stuff is just an angry 800 lb gorilla sitting between her and her work.
ETL is just fancy talk for data processing. Input, processing, output. Copy a string from a source, maybe mangle it a bit, paste that string somewhere else. Extra credit for type awareness, e.g. "oh! that string's a date!". Trophies for logging, alerts, and services that heal themselves.
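To make the "copy a string, maybe mangle it, paste it" point concrete, here is a rough Python sketch. Everything in it is made up for illustration: the names `coerce` and `run`, the choice of `YYYY-MM-DD` as the only date format recognized, and the plain list standing in for the destination. It is not any particular tool's approach, just the bare shape of the thing.

```python
import logging
from datetime import datetime

log = logging.getLogger("etl_sketch")  # hypothetical logger name


def coerce(raw):
    """Trim the string; if it looks like YYYY-MM-DD, return a date instead."""
    text = raw.strip()
    try:
        # Type awareness: "oh! that string's a date!"
        return datetime.strptime(text, "%Y-%m-%d").date()
    except ValueError:
        # Not a date in the assumed format, so pass the string through.
        return text


def run(rows, sink):
    """Input, processing, output: copy each row into the sink, coercing values."""
    for row in rows:
        out = {key: coerce(value) for key, value in row.items()}
        log.info("copied row with keys %s", sorted(out))
        sink.append(out)
    return sink
```

Calling `run([{"name": " Ada ", "dob": "1980-02-28"}], [])` gives back one row with the name trimmed and `dob` turned into a `date`. Alerts and self-healing are left out; those are where the real work is.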