

by cm 3399 days ago

We don't currently have use cases that require heavy transformations (see this blog post I wrote to explain why: https://blog.stitchdata.com/why-our-etl-tool-doesnt-do-trans...).

However, since Singer is built around piping data between applications, your suggestion - to code something that sits between taps and targets - makes perfect sense. The whole "flow" would look like:

$ tap-mydatasource | do-aggregations | target-mytarget

We'd be eager to hear from anyone who tries this approach!
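As a rough illustration of what a `do-aggregations` step in that pipeline might look like, here is a minimal sketch in Python. It assumes the tap emits newline-delimited JSON messages with a `type` of `SCHEMA`, `RECORD`, or `STATE`, as described in the Singer spec. The `record_counts` summary stream and its `stream_name` / `row_count` fields are made up for this example, and the script has not been tried against a real target.

```python
#!/usr/bin/env python3
"""Hypothetical do-aggregations filter: counts RECORD messages per stream."""
import json
import sys
from collections import Counter


def aggregate(lines):
    """Yield every input message unchanged, then per-stream record counts."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        if message.get("type") == "RECORD":
            counts[message["stream"]] += 1
        yield message  # SCHEMA, RECORD and STATE all pass through as-is

    # Assumption: "record_counts" is an invented summary stream. Its SCHEMA
    # message is emitted before its RECORD messages.
    yield {
        "type": "SCHEMA",
        "stream": "record_counts",
        "key_properties": ["stream_name"],
        "schema": {
            "type": "object",
            "properties": {
                "stream_name": {"type": "string"},
                "row_count": {"type": "integer"},
            },
        },
    }
    for stream_name, row_count in sorted(counts.items()):
        yield {
            "type": "RECORD",
            "stream": "record_counts",
            "record": {"stream_name": stream_name, "row_count": row_count},
        }


def main():
    # Read the tap's output on stdin, write messages for the target on stdout.
    for message in aggregate(sys.stdin):
        sys.stdout.write(json.dumps(message) + "\n")


if __name__ == "__main__":
    main()
```

Saved as an executable named `do-aggregations` on the `PATH`, it would slot into the pipe shown above. Passing the original messages through untouched means the target still receives the raw data and the tap's `STATE` messages, with the aggregate stream appended at the end.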

1 comment

jakestein 3399 days ago

The only thing I'd add to Chris's blog post is that, in the workflows we tend to see, most transformations are done after the data is loaded into the destination. For example, in Redshift the transformations could be defined in SQL or as Python UDFs.
