by mavam
957 days ago
We want to achieve something similar with our pipelines [1] by making the beginning and the end of the pipeline symmetric, giving you this flow:

1. Acquire bytes (void → unstructured)
2. Parse bytes to events (unstructured → structured)
3. Transform events (structured → structured)
4. Print events (structured → unstructured)
5. Send bytes (unstructured → void)

The "Publish" part is a combination of (4) and (5). Sometimes they are fused because not all APIs differentiate those steps.

We're currently focusing on building blocks (engine, connectors, formats) as opposed to application-level integrations, so turnkey Reverse ETL is not near. But the main point is that the symmetry reduces cognitive effort for the user, because they worked that muscle on the "E" side already and now just need to find the dual in the docs.

[1] https://docs.tenzir.com/blog/five-design-principles-for-buil...
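To make the symmetry in that five-step flow concrete, here is a minimal Python sketch. It is not Tenzir's API: every function name, the newline-delimited JSON format, and the in-memory list standing in for a destination are assumptions chosen only for illustration.

```python
import json
from typing import Iterable, Iterator, List


def acquire(raw: bytes, chunk_size: int = 16) -> Iterator[bytes]:
    """Step 1 (void -> unstructured): emit raw bytes in arbitrary chunks."""
    for i in range(0, len(raw), chunk_size):
        yield raw[i:i + chunk_size]


def parse(chunks: Iterable[bytes]) -> Iterator[dict]:
    """Step 2 (unstructured -> structured): newline-delimited JSON to events."""
    for line in b"".join(chunks).splitlines():
        if line.strip():
            yield json.loads(line)


def transform(events: Iterable[dict]) -> Iterator[dict]:
    """Step 3 (structured -> structured): keep errors and tag them."""
    for event in events:
        if event.get("level") == "error":
            yield {**event, "severity": "high"}


def print_events(events: Iterable[dict]) -> Iterator[bytes]:
    """Step 4 (structured -> unstructured): the dual of parse."""
    for event in events:
        yield (json.dumps(event, sort_keys=True) + "\n").encode()


def send(chunks: Iterable[bytes], sink: List[bytes]) -> None:
    """Step 5 (unstructured -> void): the dual of acquire.

    The list is a hypothetical stand-in for a socket, file, or API call.
    """
    for chunk in chunks:
        sink.append(chunk)


def run_pipeline(raw: bytes, sink: List[bytes]) -> None:
    """Compose the five steps; 'Publish' is print_events followed by send."""
    send(print_events(transform(parse(acquire(raw)))), sink)
```

In this sketch `parse` and `print_events` mirror each other, as do `acquire` and `send`, which is the duality the comment describes. A destination that takes structured records directly would fuse steps 4 and 5 into a single function.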
The primitives of many of these ETL systems are structured tables (Snowflake, Parquet, pandas DataFrames, whatever), and I don't think I'd ever choose bytes over structured tables. The unstructured parts of the data systems I've worked on have always chewed up an outsize portion of labor, with difficult-to-diagnose failure modes. The biggest cognitive-effort win of reverse ETL solutions has been making external systems and applications "speak table".