by gwerbin
4 days ago
Sorta? Maybe I'm weird. I tend to use Parquet files inside my project instead of reading directly from and writing directly to our data warehouse. That lets me cut out a lot of overhead spent on just waiting for data to flow over the network, and also as a side benefit lets me track everything with DVC, which itself has a lot of benefits like being able to summon all project data with `dvc pull`. I consider that a completely distinct use case from, say, Iceberg tables in S3.