| HN Mirror

	by gwerbin 55 days ago
	Sorta? Maybe I'm weird. I tend to use Parquet files inside my project instead of reading from and writing to our data warehouse directly. That cuts out a lot of time spent just waiting for data to flow over the network, and as a side benefit lets me track everything with DVC, which has benefits of its own, like being able to summon all project data with `dvc pull`. I consider that a completely distinct use case from, say, Iceberg tables in S3.