| I think I should elaborate a bit more. (I am paraphrasing what I've seen/built for customers working on real-world ML projects.) Let's assume the task is building a vision model like Tesla's for autonomous driving, basically taking camera feeds and turning them into 3D geometry. For that you have to: 1. Collect data 2. Curate data 3. Augment data (I don't mean classic image augmentation techniques, but, for example, connecting a simulator to provide artificial data samples) 4. Label data 5. Define your model, or figure out a good model with presets/AutoML techniques 6. Train it at scale 7. Analyze/test your model, think unit testing but for ML 8. Optimize it to run on edge hardware 9. Deploy and distribute it. All of that with proper A/B testing, different models, and continuously improving/tweaking and adding data. There are literally hundreds of tools in that space, covering one or multiple steps, but nothing that integrates the whole. It still takes an incredible amount of lead time and engineering effort to set up/build a pipeline like that, run it at scale, handle the workflows behind it, and also be able to run on premises (using cloud resources for a lot of that is a no-go for many large companies due to both security concerns and cost). Some cloud SaaS software comes quite close, e.g. Google Vertex AI, SageMaker, etc.
But they still fall well short of a production pipeline. |