| What didn't work: Shipping pickled models to other teams. Deploying SageMaker endpoints (too costly). Requiring edits to config files to deploy endpoints. What did work: Shipping HTTP endpoints. Deriving API documentation from model docstrings. Deploying Lambdas (less costly than SageMaker endpoints). Writing a ~150-line Python script to pickle the model and save a requirements.txt, some API metadata, and test input/output data. Continuous deployment (once the model is saved, no manual intervention is needed if the model's response matches the saved output data). |
It packages your model into a standardized format that you can use in multiple serving scenarios: online serving with an API endpoint, offline serving with a Spark UDF, CLI access, or importing it as a Python module. It also helps you deploy to different platforms such as Lambda, SageMaker, and others.
Our value proposition is going from a model in a notebook to a production service in 5 minutes. We'd love to hear your feedback on this. You can try out our quick start on Google Colab (https://colab.research.google.com/github/bentoml/BentoML/blo...)
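
For anyone curious what the packaging script in the quoted comment might look like, here is a minimal sketch. It is not the commenter's actual script: the function names (`package_model`, `verify_package`) and file names (`model.pkl`, `api_metadata.json`, `test_data.json`) are hypothetical, and it assumes the model is a picklable callable whose test input and output are JSON-serializable.

```python
import inspect
import json
import pickle
from pathlib import Path


def package_model(model, out_dir, requirements, test_input, expected_output):
    """Save a picklable callable `model` plus the files a deploy step needs.

    All file names below are illustrative assumptions, not a standard layout.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # 1. Pickle the model itself.
    (out / "model.pkl").write_bytes(pickle.dumps(model))

    # 2. Record the dependencies needed to unpickle and run it.
    (out / "requirements.txt").write_text("\n".join(requirements) + "\n")

    # 3. Derive API metadata from the model's docstring.
    metadata = {
        "name": getattr(model, "__name__", type(model).__name__),
        "doc": inspect.getdoc(model) or "",
    }
    (out / "api_metadata.json").write_text(json.dumps(metadata, indent=2))

    # 4. Store a known input/output pair for the deployment check.
    test_data = {"input": test_input, "expected_output": expected_output}
    (out / "test_data.json").write_text(json.dumps(test_data))
    return out


def verify_package(package_dir):
    """Return True if the unpickled model reproduces the saved output."""
    pkg = Path(package_dir)
    model = pickle.loads((pkg / "model.pkl").read_bytes())
    test_data = json.loads((pkg / "test_data.json").read_text())
    return model(test_data["input"]) == test_data["expected_output"]
```

A continuous deployment job could call `verify_package` on the saved directory and deploy only when it returns `True`, which matches the "no manual intervention if the model response matches the output data" rule. For a model object with a `predict` method, passing the bound method `model.predict` as `model` should work as long as the object itself is picklable.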