by shabazp 3029 days ago
I have deployed ML algorithms into production, including computer vision and data science models. In my experience, the right setup depends on the application and the cost you are comfortable with.

1. Keras/TensorFlow-based algorithm (applicable to any compute-intensive or GPU-capable algorithm): I deployed the model as a Flask service inside a Docker container, along with a Redis-backed queueing system for reliability. You can then decide on the server type and the orchestration tool used to manage these containers. Some options:

    a. ECS on AWS
    b. Kubernetes 
    c. Docker Swarm
    d. custom orchestration tool
2. If your ML model is simpler and small enough, AWS Lambda would also work. It can provide high throughput and low cost per request, as long as your computation time per request isn't very high.

Tips:

   i. Free memory explicitly in the code after each request is served, so that memory does not leak over time.
   ii. You can use tools such as htop to monitor memory usage.
   iii. For system performance, you can use Prometheus to gather stats, with a Grafana dashboard to view them.
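To make tip i concrete for the queue-backed service in option 1, here is a small sketch of a worker loop that cleans up after every job. It makes several assumptions: the standard library `queue.Queue` stands in for Redis, `predict` is a hypothetical placeholder for the real model call, and `tracemalloc` is used only as an in-process complement to watching htop.

```python
import gc
import queue
import threading
import tracemalloc


def predict(payload):
    """Hypothetical model call; allocates a large temporary buffer."""
    scratch = [float(x) for x in payload] * 10_000
    return sum(scratch) / len(scratch)


def worker(jobs, results):
    """Pull jobs until a None sentinel arrives, freeing memory after each one."""
    while True:
        job = jobs.get()
        if job is None:
            break
        job_id, payload = job
        try:
            results[job_id] = predict(payload)
        finally:
            # With Keras, tf.keras.backend.clear_session() would typically go here too.
            gc.collect()


tracemalloc.start()
jobs, results = queue.Queue(), {}
thread = threading.Thread(target=worker, args=(jobs, results))
thread.start()
for i in range(3):
    jobs.put((i, [1, 2, 3]))
jobs.put(None)  # sentinel: tell the worker to stop
thread.join()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(f"current={current} bytes, peak={peak} bytes")
```

If `current` keeps climbing across jobs while `peak` stays roughly flat per job, something is holding references between requests.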
I consider CI/CD essential for achieving a seamless workflow for a data scientist, and after having faced the same problems ourselves, our team and I have been working on Datmo, a tool to help companies more easily and cost-effectively deploy + manage their models in production.

3. Another important component of CI/CD is integrating tools like Airflow to schedule and monitor workflows. This helps us deploy newly trained models.