Hacker News new | ask | show | jobs
by nchammas 1947 days ago
Interesting project! (The interactive slides are cool btw.)

Could you share a bit about how engineers express data transformations in Flow?

From a quick look at the docs, it doesn't look like you are using SQL to do this, which is interesting since it bucks the general trend in data tooling.

1 comments

You write pure-function mappers using any imperative language. We're targeting TypeScript initially for $reasons, followed by ergonomic OpenAPI / WASM support, but anything that can support JSON => JSON over HTTP will do.

It's only on you to write mapper functions -- combine & reduce happens automatically, using reduction annotations of the collection's JSON-schema. Think of it as a SQL group-by where aggregate functions have been hoisted to the schema itself.

Here's a worked-up narrative example using Citi Bike system data: https://estuary.readthedocs.io/en/latest/examples/citi-bike/...

| it bucks the general trend in data tooling

You can say that again. It's a bet, and I don't know how it will work out. I do think SQL is a poor fit for expressing long-lived workflows that evolve over joined datasets / schemas / transformations, or which are more operational vs analytical in nature, or which want to bring in existing code. Though I _completely_ understand why others have focused on SQL first, and it's definitely not an either / or proposition.