by georgewfraser 2011 days ago
What is not said in this article is that you can use modern data warehouses, like Snowflake and BigQuery, in the exact same way: a single system that serves as both your data lake and your data warehouse. Databricks and the cloud data warehouses are rapidly converging. Databricks has enough SQL functionality that it can reasonably be called an RDBMS, and Snowflake has demonstrated that you can incorporate the benefits of a data lake into a data warehouse by separating compute from storage. At this point, the main difference is that with Databricks you can directly access the underlying Parquet files in S3. Does that matter? For some users, yes.

There isn’t much stopping a data warehouse provider from offering direct storage access, as long as they are willing to maintain some semblance of backward compatibility in the storage format.
That's exactly what Snowflake lets you do. I worked with an analytics vendor that was using Snowflake behind the scenes, and rather than ETL or copy the data over, we had the option of just pointing our Snowflake compute at their Snowflake storage.