by jstephan
2232 days ago
(Former Databricks software engineer speaking)
The pain point they didn’t solve (well enough) is Spark cluster management and configuration. From our experience and user interviews, it’s the critical pain point that still slows down Spark adoption. With our automated tuning feature, we go further than they do toward providing a serverless experience for our users. That said, Databricks is a great end-to-end data science platform, with notable features we lack, such as collaborative hosted notebooks. A lot of people don’t want or need the full proprietary feature set of Databricks, though. They choose to build on EMR, Dataproc, and other platforms instead. We hope they’ll try Data Mechanics now :)
One thing I constantly deal with is how to optimize Spark: how to use Ganglia and the Spark UI to dig into what is causing data skew and slowness while jobs are running. Is this something you do better than Databricks?