


	by endriju 3596 days ago
	I'm in desperate need of a scalable Apache Spark cluster, available through an API, that would make it easy to submit jobs that process datasets of arbitrary size while letting me abstract away the scaling part of the problem. I don't understand how nothing like that exists already, considering the popularity of Spark.

1 comment

Would Databricks solve this problem? https://databricks.com/

They are essentially Apache Spark-as-a-service and have an API that allows you to submit a job on a cluster that you can configure to autoscale: https://community.cloud.databricks.com/doc/api/#jobs.JobsSer... https://community.cloud.databricks.com/doc/api/#jobs.Cluster...
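
For a rough idea of what that could look like, here is a minimal Python sketch that builds (but does not send) a request to run a script on an autoscaling cluster. The endpoint path `/api/2.0/jobs/runs/submit` and the field names (`new_cluster`, `autoscale`, `min_workers`, `max_workers`, `spark_python_task`) are my assumptions based on the Databricks Jobs REST API as I understand it, not taken from the truncated links above, so check them against the current docs. The host, token, script path, and the `<spark-version>` / `<node-type>` placeholders are hypothetical.

```python
import json
import urllib.request


def build_run_payload(python_file, min_workers, max_workers,
                      spark_version="<spark-version>",
                      node_type_id="<node-type>"):
    """Build a JSON body for a one-off run on an autoscaling cluster.

    Field names are assumed from the Databricks Jobs API; the default
    spark_version and node_type_id are placeholders you must replace.
    """
    if not 0 < min_workers <= max_workers:
        raise ValueError("need 0 < min_workers <= max_workers")
    return {
        "run_name": "example-run",
        "new_cluster": {
            "spark_version": spark_version,
            "node_type_id": node_type_id,
            # The service picks a worker count inside this range,
            # which is the "abstract away the scaling" part.
            "autoscale": {
                "min_workers": min_workers,
                "max_workers": max_workers,
            },
        },
        "spark_python_task": {"python_file": python_file},
    }


def build_request(host, token, payload):
    """Wrap the payload in an authenticated POST request (not sent here)."""
    return urllib.request.Request(
        url=f"{host}/api/2.0/jobs/runs/submit",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To actually submit, you would pass the result of `build_request(...)` to `urllib.request.urlopen` with a real workspace URL and access token. Keeping the payload builder separate from the network call makes it easy to inspect what would be sent before spending money on a cluster.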