by EdwardDiego
1681 days ago
Very much true. I saw a joke tweet recently, something along the lines of: "It's amazing how many data engineering scaling issues these days are being solved by just paying Snowflake more money." Spark does take a lot of tuning, but then I'm guessing Databricks offers that service as part of your licensing fee? (I'd hope so. If they're selling a product based on FOSS code, there has to be a value-add to justify it.)
They have some proprietary features like DBIO [1]. They also have some cloud-specific features like storage autoscaling [2] that would not be available in OSS Spark. Even Delta Lake [3] used to be proprietary, but I suspect the rise of open-source frameworks like Iceberg led them to open-source it.
Shameless plug: while working at a since-shutdown competitor to Databricks, I came up with storage autoscaling long before they did [4], so it's not unlikely that they were "inspired" by us :-)
1. https://docs.databricks.com/spark/latest/spark-sql/dbio-comm...
2. https://databricks.com/blog/2017/12/01/transparent-autoscali...
3. https://delta.io/
4. https://www.qubole.com/blog/auto-scaling-in-qubole-with-aws-...