by sys13
880 days ago
(I do some training for Databricks)
1. Yeah, cluster startup time is no fun. Here are some solutions:
- pools (they keep instances around so you don't have to wait for the cloud to provision them)
- serverless SQL warehouses (viable if you're doing only SQL)
- one job with multiple tasks that share the same job cluster. Delta Live Tables does a similar thing but with streaming autoscaling
- streaming: the cluster never needs to go down. You can run multiple streams on the same cluster so they load-balance each other
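
To make the pool and shared job cluster options concrete, here is a rough sketch of what a Jobs API 2.1 payload could look like when two tasks reuse one job cluster that draws from a pool. The job name, task keys, notebook paths, runtime version, and pool ID are all placeholders I made up, so treat this as an illustration rather than a config to copy.

```python
import json


def build_job_payload(pool_id: str) -> dict:
    """Build a job definition where every task reuses one job cluster."""
    # One cluster definition, referenced by key from each task below.
    # instance_pool_id points the cluster at a pool of warm instances.
    shared_cluster = {
        "job_cluster_key": "shared",
        "new_cluster": {
            "spark_version": "13.3.x-scala2.12",  # placeholder runtime version
            "instance_pool_id": pool_id,
            "num_workers": 2,
        },
    }
    # Hypothetical tasks: both name the same job_cluster_key, so the
    # cluster starts once for the run instead of once per task.
    tasks = [
        {
            "task_key": "ingest",
            "job_cluster_key": "shared",
            "notebook_task": {"notebook_path": "/Jobs/ingest"},
        },
        {
            "task_key": "transform",
            "depends_on": [{"task_key": "ingest"}],
            "job_cluster_key": "shared",
            "notebook_task": {"notebook_path": "/Jobs/transform"},
        },
    ]
    return {
        "name": "example-shared-cluster-job",
        "job_clusters": [shared_cluster],
        "tasks": tasks,
    }


if __name__ == "__main__":
    # "pool-123" is a made-up pool ID for illustration.
    print(json.dumps(build_job_payload("pool-123"), indent=2))
```

The point is that startup cost is paid once per job run rather than once per task, and the pool shortens that one startup.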
I see a lot of companies that get sold on Databricks and then are surprised by the cost.