|
|
|
|
|
by ramkiz
74 days ago
|
|
Very cool idea. The part I would love to hear more about is how you are thinking about the boundary between notebook/IDE convenience and actual data lake guarantees. For example, what exactly is versioned, how reproducible are transformations, and how much lineage visibility do I get once I start mixing SQL, PySpark, natural language queries, and imported web/DB data? |
|
You will get job run level lineage for any datasets created in the system.