
[1] http://www.vldb.org/pvldb/vol13/p1206-dreseler.pdf
[2] https://stratos.seas.harvard.edu/files/stratos/files/columnstoresfntdbs.pdf
[3] https://web.stanford.edu/class/cs245/readings/c-store.pdf
[4] http://sites.computer.org/debull/A12mar/vectorwise.pdf

I agree with everything above. The main advantage the newer data warehouses have over the legacy on-prem incumbents is that they got to build from scratch, having learned from all of the challenges the original players encountered.

The public pissing contest is entertaining while also being silly and slightly cringe, but I think it's a nice story for Databricks nonetheless. They now have a performant SQL-based analytics engine that can credibly compete with the best DWs in the market today, and it's just one part of their overall platform.

The sense I get is that Snowflake wants the conversation to be "no matter what you do, you need a data warehouse, and we're the best in the business at that." Databricks' Lakehouse approach is a fundamental challenge to that, and if they're getting this kind of performance from their analytics engine against the market-leading data warehouses today, that's a big momentum shift in their favour.