by seddonm1 1912 days ago

We provide multiple Docker images (https://github.com/orgs/tripl-ai/packages) that make Spark deployment easy:

- arc-jupyter: lets you develop on your local machine (including offline), or you can integrate it with a JupyterHub deployment on Kubernetes (https://zero-to-jupyterhub.readthedocs.io/en/stable/index.ht...). We have built JupyterHub on GCP Kubernetes (GKE) with full user-level auth via GCP IAM. If anyone is interested, I can publish a secrets-removed version of our script.

- arc: the execution-only Docker image (so it is smaller than arc-jupyter). We have this orchestrated on Kubernetes too, and now that Spark officially supports Kubernetes deployment, it is really easy to create and destroy clusters on demand.
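
To give a rough idea of what "on demand" looks like with Spark's native Kubernetes support, here is a minimal sketch that assembles a `spark-submit` command in cluster mode. This is not our actual launch script: the API server address, the image reference, and the job path are hypothetical placeholders, and the helper `build_spark_submit` is invented for illustration. The `spark-submit` flags and the two `spark.*` properties are standard Spark options.

```python
import shlex


def build_spark_submit(api_server, image, executors, app_resource):
    """Assemble a spark-submit command for Spark's native Kubernetes mode.

    All argument values are supplied by the caller; nothing here is
    specific to the arc image.
    """
    return [
        "spark-submit",
        # The k8s:// prefix tells Spark to schedule on a Kubernetes API server.
        "--master", f"k8s://{api_server}",
        # In cluster mode the driver itself runs in a pod.
        "--deploy-mode", "cluster",
        # Container image used for the driver and executor pods.
        "--conf", f"spark.kubernetes.container.image={image}",
        # Number of executor pods to request for this job.
        "--conf", f"spark.executor.instances={executors}",
        app_resource,
    ]


# Hypothetical values, for illustration only.
cmd = build_spark_submit(
    api_server="https://kubernetes.example.com:6443",
    image="registry.example.com/arc:latest",
    executors=3,
    app_resource="local:///opt/spark/jars/example-job.jar",
)
print(shlex.join(cmd))
```

The executor pods exist only for the lifetime of the job, and that is what makes the create-and-destroy pattern cheap: there is no long-running Spark cluster to maintain between runs.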