by Joe8Bit 1912 days ago
Yeah, agreed. I was a Databricks skeptic when I first came across it, but its value goes a LONG way beyond just managing Spark.

For example, we found that Databricks' Spark (or their 'Delta engine' or whatever it's called) had 50-60% better performance on our workloads than 'core' Spark. I guess that's not surprising when a large proportion of Spark contributors work for you and can performance-tune! Not to mention things like MLflow and all their data engineering stuff.

This is a cool project, and I admire its ambition, but saying it's a real 'alternative' to Databricks is a bit disingenuous.

1 comment

Databricks writes some good tools, but it can get pretty expensive. Kubeflow has been evolving well and is gaining lots of traction. It's pretty neat from my experience so far.