by programnature
3581 days ago
While it's useful to have this kind of info, IMHO it's still far from 'infrastructure for deep learning'. What about model versioning? What about deployment environments? We need to address the whole lifecycle, not just the 'training' bit. This is a huge and underserved part of the problem, because people tend to be satisfied with having one model that's good enough to publish.
|
If the data and models were small and training was quick (on the order of compilation time), I'd just keep the training data in git and train the model from scratch every time I run make. But the data is huge, training requires clusters of machines and can take days, so you need a pipeline.
An industrial-strength system looks like this: https://code.facebook.com/posts/1072626246134461/introducing...
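
To make the "run make" comparison concrete, here is a minimal sketch of a make-style pipeline stage in Python: it hashes the input files and reruns the stage only when those hashes change. Everything here (`file_digest`, `run_stage`, `toy_train`, the `.json` stamp files) is a hypothetical name invented for illustration, not part of any real tool, and a real pipeline would also need to handle remote storage, cluster scheduling, and code or hyperparameter changes.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 of a file, read in chunks so large files fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def run_stage(name, inputs, output, fn, cache_dir):
    """Run fn(inputs, output) only if the inputs changed since the last run.

    Returns True if the stage ran, False if the existing output was reused.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    stamp = cache_dir / f"{name}.json"
    # The cache key maps every input path to its content hash.
    key = {str(p): file_digest(p) for p in inputs}
    if output.exists() and stamp.exists() and json.loads(stamp.read_text()) == key:
        return False  # inputs unchanged: skip the expensive step
    fn(inputs, output)
    stamp.write_text(json.dumps(key))
    return True


def toy_train(inputs, output):
    """Stand-in for training (assumption): the 'model' is the mean of the inputs."""
    values = [float(x) for p in inputs for x in p.read_text().split()]
    output.write_text(str(sum(values) / len(values)))
```

Calling `run_stage("train", [data], model, toy_train, cache_dir)` twice in a row trains once and then skips; editing the data file triggers a retrain. Hashing content rather than comparing timestamps (which is what make does) is a deliberate choice in this sketch, since a fresh checkout or a copy between machines changes timestamps without changing the data.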