| I think it just took us too long to write the product, as the scope was really big: - CLI/UI. - Main backend for storing projects, user accounts, managing pull requests, forks, runners, deployers, load balancers. - Data backend (dotmesh). - Auto-provisioning of VMs with JupyterLab running and data synced to GCP and AWS. - Runners that configure the environment, install dependencies, and open up tunnels so users can access them and start working. - Optimized machine image builds so that startup takes ~1 min (some of the Docker images, like JupyterLab, are very big). - Model packaging into Docker images. - Model metrics capturing (a proxy that runs as a sidecar and intercepts requests) and then attaching relevant classes for your models. - Kubernetes operator to deploy the actual models. Users didn't have to worry about creating deployment manifests, services or ingresses (they wouldn't even need to care about Docker images). They would just say which model to deploy and they would get a URL. Models could be deployed in a k8s cluster built from spot-instance nodes, so they would run pretty cheap :) - The last component that I worked on was probably one of the most fun: an inference router that allowed canary deployments for models, and also shadow deployments where traffic is sent to many models at once but responses are taken only from the primary. We had a really nice UI for this as well, where you could drag sliders around to configure the % of traffic and so on. Unfortunately we never managed to write docs for this. - Terraform to wrap everything and deploy to GCP/AWS. Our team was always quite small, so we were stretched thin. In the end sales were going well too; probably 6 more months and we would have broken even and then become profitable :) |
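To make the canary/shadow idea from the inference router concrete, here is a minimal sketch in Python. It is not the actual router described above: the `InferenceRouter` class, its fields, and the callable-based `Model` type are all hypothetical names chosen for illustration, and models are assumed to be plain functions from a request dict to a response dict.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Assumption: a "model" is just a callable taking a request dict and
# returning a response dict. A real router would call HTTP/gRPC endpoints.
Model = Callable[[dict], dict]


@dataclass
class InferenceRouter:
    """Hypothetical router: canary split plus shadow traffic."""
    primary: Model
    canary: Optional[Model] = None
    canary_percent: float = 0.0  # 0..100, the value a UI slider would set
    shadows: List[Model] = field(default_factory=list)
    rng: random.Random = field(default_factory=random.Random)

    def route(self, request: dict) -> dict:
        # Shadow deployment: every shadow model sees a copy of the request,
        # but its response (or failure) never reaches the caller.
        for shadow in self.shadows:
            try:
                shadow(dict(request))
            except Exception:
                pass  # a broken shadow must not affect live traffic

        # Canary deployment: send roughly canary_percent of requests
        # to the canary model, the rest to the primary.
        if self.canary is not None and self.rng.random() * 100 < self.canary_percent:
            return self.canary(request)
        return self.primary(request)
```

In this sketch the shadow calls are made synchronously for simplicity; a production router would more likely fire them asynchronously so that slow shadow models do not add latency to the primary response.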