by claytonjy 3024 days ago

I don't work on computer vision specifically, and can generally tolerate a few seconds of latency, but we have a two-stage process that I think applies fairly generally:

1. All lowish-level production code (importing, transforming, modeling) is written in packages, which are thoroughly unit-tested with a CI system.

2. That code is wrapped up into Docker container(s) in separate repositories, which are built and integration-tested with CI. In addition to the Dockerfile and any testing scripts, there's usually a single code file here which handles I/O specifics and API endpoints, and otherwise mostly just calls code from the package.
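To make the split concrete, here's a minimal sketch in Python. Everything in it is made up for illustration: `transform` and `predict` stand in for functions that would live in the unit-tested package, and `handle_request` stands in for the single file in the container repo. The JSON shape and the fixed weights are assumptions, not anything from a real project.

```python
import json

# --- Stage 1: would live in the package (hypothetical functions) ---

def transform(record: dict) -> list[float]:
    """Turn a raw input record into a feature vector."""
    return [float(record["x"]), float(record["y"])]


def predict(features: list[float]) -> float:
    """Stand-in for a real model: a fixed linear score."""
    weights = [0.5, 2.0]  # assumed values, purely illustrative
    return sum(w * f for w, f in zip(weights, features))


# --- Stage 2: the single file in the container repo ---

def handle_request(body: bytes) -> tuple[int, bytes]:
    """Parse the request, call package code, serialize the response.

    Returns an (HTTP status, response body) pair so that whatever web
    framework sits in front of it stays a thin shell.
    """
    try:
        record = json.loads(body)
        score = predict(transform(record))
    except (ValueError, KeyError, TypeError) as exc:
        # Bad JSON, missing fields, or wrong types are the caller's fault.
        return 400, json.dumps({"error": str(exc)}).encode()
    return 200, json.dumps({"score": score}).encode()
```

The point of the shape is that the unit tests target `transform` and `predict` directly, while the integration tests only need to hit the endpoint that calls `handle_request`.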

This works well with R or Python and should work with others; we use GitLab for the free private repos and awesome built-in CI.

This doesn't cover where the data or models are stored, but that varies more per-project for us. Lately we've been using Pachyderm and loving it, but you can get pretty far with a Postgres instance for data and storing trained model objects in S3/GCS.
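For the "trained model objects in a bucket" part, a rough sketch of the idea, assuming the model can be pickled. `LocalStore` is a hypothetical stand-in that writes to a directory; in practice you'd swap in a small class with the same `put`/`get` methods backed by your S3 or GCS client. The key name is also made up.

```python
import pickle
from pathlib import Path


class LocalStore:
    """Hypothetical stand-in for an object store such as S3 or GCS."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, key: str, data: bytes) -> None:
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self.root / key).read_bytes()


def save_model(store, key: str, model) -> None:
    """Serialize a trained model object and write it under `key`."""
    store.put(key, pickle.dumps(model))


def load_model(store, key: str):
    """Read the bytes stored under `key` and rebuild the model object."""
    return pickle.loads(store.get(key))
```

Usual caveat: only unpickle objects from a bucket you control, since loading a pickle can execute arbitrary code.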