by sandGorgon
2783 days ago
Interesting. In that case, why do you even use Docker? Does it make distributing models easier? Would love to know more about your packaging setup - the branch name to divide datasets is a nice trick (I'll use it as well). How does your CI know where to find models? I'm betting you are using some kind of convention here - one model per py file... so package each py file in a Docker container. If it is possible, would love to see the skeleton structure of one of your pre-packaged files. TL;DR - it seems you invented something like pyml as well. Are the deployment scripts + model skeletons open source?
In the ML projects, it serves mainly to package dependencies and to enforce some basic security constraints: raw datasets are mounted read-only, so if we suspect an issue with cached results (because our inner orchestrator is Make...) we can nuke all the results and start over from scratch, confident that the raw data is intact.
The models and arguments are in the CI config. No magic there, but since it’s all in the repo I’m ok with it.
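To make that concrete, here is a rough sketch of the idea, not our actual setup (which I can't share). It assumes a hypothetical `CI_MODELS` mapping standing in for the CI config, made-up paths under `/srv/datasets`, a made-up image name, and the branch-name-per-dataset convention mentioned above. The only real Docker feature it relies on is the `:ro` suffix on a `-v` bind mount, which makes the raw data read-only inside the container.

```python
# Hypothetical stand-in for the CI config: model name -> CLI arguments.
# In practice this would live in the repo next to the CI definition.
CI_MODELS = {
    "model_a": ["--epochs", "10"],
    "model_b": ["--epochs", "5", "--lr", "0.01"],
}


def docker_run_command(image, model, args, raw_dir, results_dir):
    """Build a `docker run` argv with raw data mounted read-only."""
    return [
        "docker", "run", "--rm",
        # ":ro" makes the bind mount read-only inside the container.
        "-v", f"{raw_dir}:/data/raw:ro",
        # Results stay writable so they can be wiped and regenerated.
        "-v", f"{results_dir}:/data/results",
        image, model, *args,
    ]


def commands_for_branch(branch, config=CI_MODELS, image="ml-pipeline:latest"):
    """One command per configured model; the dataset is chosen by branch name.

    The directory layout and image name are assumptions for illustration.
    """
    raw_dir = f"/srv/datasets/{branch}/raw"
    results_dir = f"/srv/datasets/{branch}/results"
    return [
        docker_run_command(image, model, args, raw_dir, results_dir)
        for model, args in config.items()
    ]


if __name__ == "__main__":
    for cmd in commands_for_branch("trial-x"):
        print(" ".join(cmd))
```

The CI job would then hand each argv to something like `subprocess.run(cmd, check=True)`. Since only the results directory is writable, deleting it and rerunning is always safe.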
This whole setup was put together for an upcoming clinical trial as steps toward ISO quality norms compliance, and I can’t share it now. I do intend to reproduce it in an open form alongside our existing software (GitHub.com/the-virtual-brain) when it’s ready.
In any case I appreciate your questions a lot: they drove me to think a little harder and see why tools like Michelangelo and PyML are something that even we (an academic/clinical group) should be using... if we can find the time to do it.