by flowerlad
2232 days ago
Running Spark on a Kubernetes cluster is already pretty easy, so it is unclear what value this adds. Controlling cost is the hard part. You may need a cluster for only 1 hour per day for a nightly aggregation job. Kubernetes clusters are not easy to provision and de-provision, so you end up paying for a cluster 24 hours a day and using it for only 1 hour. If someone came up with a way to pay for pre-provisioned Kubernetes clusters only for the time you actually use them, that would be interesting.
Regarding costs: by autoscaling the cluster size and minimising our service footprint, we keep the fixed cost of using our platform at around $100/month, which is negligible compared to the cost of most big data projects. We have some ideas on how to drive this fixed cost to zero, and on offering a free hosted version of our platform too. It's on the roadmap!