by tixocloud 2621 days ago
We're building a deployment system based on the following principles that we've learned:

- Version control of models and datasets

- Code, naming conventions and formatting consistency

- Testing of models before deployment into production (in most cases, this helps gain credibility across engineering)

- Model review process with a senior data scientist

- Kubernetes/Docker for deployment, either serving the model as an API or running it as a scheduled job

- Model performance monitoring when in production to identify degradation
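To make the "testing before deployment" principle concrete, here is a minimal sketch of a promotion gate. It is only an illustration, not the system described above: the function name `deployment_gate`, the `accuracy` metric key, and both thresholds are hypothetical and would need to be replaced with whatever your team actually measures.

```python
from dataclasses import dataclass, field


@dataclass
class GateResult:
    passed: bool
    reasons: list = field(default_factory=list)


def deployment_gate(candidate_metrics, baseline_metrics,
                    min_accuracy=0.80, max_regression=0.02):
    """Decide whether a candidate model may be promoted to production.

    Both metric dicts are assumed to carry an "accuracy" key (hypothetical).
    """
    reasons = []
    accuracy = candidate_metrics["accuracy"]

    # Absolute floor: the model must be good enough on its own.
    if accuracy < min_accuracy:
        reasons.append(
            f"accuracy {accuracy:.3f} is below the floor {min_accuracy:.3f}"
        )

    # Relative check: the model must not be clearly worse than production.
    drop = baseline_metrics["accuracy"] - accuracy
    if drop > max_regression:
        reasons.append(
            f"accuracy regressed by {drop:.3f} versus the production baseline"
        )

    return GateResult(passed=not reasons, reasons=reasons)
```

A check like this could run as a build step and fail the build when `passed` is false, with `reasons` giving the reviewer something specific to look at.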


Could you also give us some details about the software you specifically use in the pipeline, other than Kubernetes/Docker? Do you use any available (possibly open source) tools for versioning and monitoring? Or are you building this all up from scratch for your needs?

We use Jenkins for our builds (and soon simple test cases), which then feeds into our system that's built from scratch. However, we are looking to productize the system, as we've had some discussions with other corporate partners who are interested in it as well.

The system we've built in-house is fairly simple to keep development fast - versioning is manual (https://packaging.python.org/guides/hosting-your-own-index/), but there's no reason why you couldn't use a repository manager.
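As a rough sketch of what manual versioning on a self-hosted index can look like, assuming each model is packaged as a regular Python distribution: a simple index expects one directory per project, named with the PEP 503 normalized project name, containing the versioned files. The root path, project name, and filename below are made up for illustration.

```python
import re
from pathlib import PurePosixPath


def normalize(name):
    """Normalize a project name as PEP 503 specifies for simple indexes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def index_path(root, project, filename):
    """Return where a release file belongs under the index root.

    Layout: <root>/<normalized project name>/<release file>
    """
    return PurePosixPath(root) / normalize(project) / filename


# Hypothetical model package; the version is bumped by hand in the filename.
print(index_path("/srv/simple", "Churn_Model", "churn_model-1.2.0.tar.gz"))
```

Any web server that serves the root directory with directory listings enabled can then be passed to pip via `--index-url`.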

For monitoring, we essentially track any activity related to the model, including inputs, outputs, timestamps, duration, etc., in a database and render it with JavaScript charts. We might put this into Kafka, but that seems like overkill at the moment and would likely force us to hire an actual support team.
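The tracking described above can be sketched in a few lines. This is not our actual code, just a minimal illustration that assumes SQLite as the database and JSON-serializable inputs and outputs; the table name `prediction_log`, its columns, and the helper names are all hypothetical.

```python
import json
import sqlite3
import time
from datetime import datetime, timezone


def open_log(path=":memory:"):
    """Open (or create) the database that stores one row per model call."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS prediction_log (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               model_version TEXT NOT NULL,
               called_at TEXT NOT NULL,
               duration_ms REAL NOT NULL,
               inputs TEXT NOT NULL,
               outputs TEXT NOT NULL
           )"""
    )
    return conn


def logged_predict(conn, model_version, predict_fn, inputs):
    """Call predict_fn(inputs) and record inputs, outputs, time and duration."""
    called_at = datetime.now(timezone.utc).isoformat()
    start = time.perf_counter()
    outputs = predict_fn(inputs)
    duration_ms = (time.perf_counter() - start) * 1000.0

    conn.execute(
        "INSERT INTO prediction_log "
        "(model_version, called_at, duration_ms, inputs, outputs) "
        "VALUES (?, ?, ?, ?, ?)",
        (model_version, called_at, duration_ms,
         json.dumps(inputs), json.dumps(outputs)),
    )
    conn.commit()
    return outputs
```

A charting front end can then query this table directly, for example averaging `duration_ms` per day or comparing output distributions across `model_version` values to spot degradation.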