by nchammas
1947 days ago
Interesting project! (The interactive slides are cool btw.) Could you share a bit about how engineers express data transformations in Flow? From a quick look at the docs, it doesn't look like you are using SQL to do this, which is interesting since it bucks the general trend in data tooling.
|
You only have to write the mapper functions -- combine & reduce happen automatically, driven by reduction annotations in the collection's JSON schema. Think of it as a SQL group-by where the aggregate functions have been hoisted into the schema itself.
Here's a worked narrative example using Citi Bike system data: https://estuary.readthedocs.io/en/latest/examples/citi-bike/...
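To make the "aggregates hoisted into the schema" idea concrete, here's a rough Python sketch. It is not Flow's actual API: the `SCHEMA` layout, the `"reduce"` annotation shape, the strategy names, and the `map_ride` / `combine` helpers are all made up for illustration. The point is only that the user writes the mapper, and a generic combiner reads the schema to decide how documents sharing a key get merged.

```python
# Hypothetical schema: each property declares how it should be reduced.
# The annotation shape and strategy names are assumptions for this sketch.
SCHEMA = {
    "key": "station",
    "properties": {
        "station": {"type": "string"},
        "rides": {"type": "integer", "reduce": "sum"},
        "longest_secs": {"type": "integer", "reduce": "max"},
    },
}

# Strategy name -> binary reduction function.
REDUCERS = {"sum": lambda a, b: a + b, "max": max, "min": min}


def map_ride(ride):
    """User-written mapper: one raw ride event -> one partial aggregate."""
    return {"station": ride["start"], "rides": 1, "longest_secs": ride["secs"]}


def combine(schema, docs):
    """Generic combiner: groups by the schema key and applies each
    property's annotated reduction. Contains no use-case-specific logic."""
    key = schema["key"]
    out = {}
    for doc in docs:
        group = doc[key]
        if group not in out:
            out[group] = dict(doc)  # first document seen for this key
            continue
        acc = out[group]
        for name, prop in schema["properties"].items():
            strategy = prop.get("reduce")
            if strategy:  # un-annotated properties keep their first value
                acc[name] = REDUCERS[strategy](acc[name], doc[name])
    return out


rides = [
    {"start": "A", "secs": 300},
    {"start": "B", "secs": 120},
    {"start": "A", "secs": 900},
]
totals = combine(SCHEMA, map(map_ride, rides))
print(totals)
```

In SQL terms this is roughly `SELECT station, SUM(rides), MAX(longest_secs) ... GROUP BY station`, except the `SUM` / `MAX` choices live in the schema rather than in each query.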
| it bucks the general trend in data tooling
You can say that again. It's a bet, and I don't know how it will work out. I do think SQL is a poor fit for expressing long-lived workflows that evolve over joined datasets / schemas / transformations, or which are more operational vs analytical in nature, or which want to bring in existing code. Though I _completely_ understand why others have focused on SQL first, and it's definitely not an either / or proposition.