because 90% of the industry work is MLOps
the pipeline usually goes
1. Make a POC inside a Jupyter Notebook with some scrappy data and an off-the-shelf model,
define metrics and train a baseline to see if the whole ML endeavour might even be worth it
2. Do error analysis, find better data, tune parameters, re-train to see if you can improve upon the baseline
3. Make the first deployment, set up data collection
4. Automate step 2 as much as possible because the data is ever-changing and you want to try many more off-the-shelf models
5. Deploy new models and collect ever more feedback
4 and 5 are basically a while loop that never ends and that's mostly MLOps
It still requires proper ML expertise, especially when things break tho
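To make the "while loop" of steps 4 and 5 concrete, here's a rough sketch in plain Python. Everything in it is an assumption for illustration: `retrain_loop`, `train`, `evaluate` and `baseline_score` are made-up names, not anyone's actual pipeline, and a real setup would run forever on a scheduler instead of iterating over a finite list of batches.

```python
from typing import Callable, Iterable, List, Tuple


def retrain_loop(
    data_batches: Iterable[list],
    train: Callable[[list], object],
    evaluate: Callable[[object], float],
    baseline_score: float,
) -> List[Tuple[object, float]]:
    """Steps 4 and 5 as a loop: re-train on new data, deploy only if it beats what's live.

    All names here are hypothetical; higher score is assumed to mean better.
    """
    deployed: List[Tuple[object, float]] = []  # history of (model, score) that got shipped
    best_score = baseline_score                # score of whatever is currently live
    seen: list = []                            # all data collected so far

    for batch in data_batches:                 # in production this is effectively `while True`
        seen.extend(batch)                     # step 5: feedback collected from the deployment
        model = train(seen)                    # step 4: automated re-train
        score = evaluate(model)
        if score > best_score:                 # only ship a model that improves on the live one
            best_score = score
            deployed.append((model, score))
    return deployed
```

The "only deploy if it beats the live model" check is one possible gating rule. It's also the spot where things tend to break (bad data, a misleading metric), which is where the ML expertise comes in.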
we've got 4 and 5 pretty automated... the real issue (as you're likely alluding to) is when steps 1/2/3 pull in new data that's completely infeasible to get at scale, and then someone wants you to re-train daily, 2x a day, every 4 hours, continuously. Oh, and your costs go through the roof and likely aren't worth the returns anymore when you're chasing that .001%
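A back-of-the-envelope way to frame that cost vs. returns question, as a sketch only: the function names, the "value per unit of metric" idea and the budget numbers are all assumptions for illustration, not figures from any real system.

```python
def worth_retraining(expected_gain: float, value_per_unit_gain: float, retrain_cost: float) -> bool:
    """Return True if the estimated value of the metric gain exceeds the cost of one re-train.

    expected_gain: estimated metric improvement (e.g. 0.001 for that ".001%"-style gain)
    value_per_unit_gain: hypothetical money value of a gain of 1.0 in the metric
    retrain_cost: hypothetical cost of one re-train (compute, data, people)
    """
    return expected_gain * value_per_unit_gain > retrain_cost


def max_retrains_per_day(daily_budget: float, retrain_cost: float) -> int:
    """How many re-trains per day a fixed budget can pay for (rounded down)."""
    if retrain_cost <= 0:
        raise ValueError("retrain_cost must be positive")
    return int(daily_budget // retrain_cost)
```

With made-up numbers: a gain of 0.001 valued at 10,000 per unit doesn't cover a re-train that costs 50, so re-training every 4 hours would be hard to justify on those terms.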