HN Mirror


by mrtranscendence 1205 days ago

Databricks is fine. I wasn't happy using it until they implemented the ability to work in a git repo, with proper file support, and that has gone some way toward making it more usable for me. The interface sucks pretty hard: it slows down and uses a significant amount of memory with only a modestly high number of cells (where a JupyterLab notebook would remain very snappy). I also wish there were a better story for local development; they've addressed this to some degree recently, but I'm not sold on their solution.

It's certainly better than what we did prior to Databricks, which was to roll our own in-house provisioning and notebook solution. I won't/can't go into too many details, but it was cumbersome and very buggy, and it was as if they had designed it to encourage data scientists to spend as much money on compute as possible (only to panic at the millions they were spending). They dropped it for cost reasons, which is hilarious given how expensive Databricks is.

I do appreciate the work Databricks have done improving Spark. Capabilities like adaptive query execution have made optimization significantly easier.

1 comment

sandkoan 1205 days ago

When you say you wish they had a "better story for local development," what do you mean? What do you wish for?
