Hacker News
by greymalik 1193 days ago
Why is that your number one recommendation?

Because 90% of industry work is MLOps. The pipeline usually goes:

1. Make a POC inside a Jupyter Notebook with some scrappy data and an off-the-shelf model, define metrics, and train a baseline to see if the whole ML endeavour might even be worth it.
2. Do error analysis, find better data, tune parameters, and re-train to see if you can improve upon the baseline.
3. Make the first deployment and set up data collection.
4. Automate step 2 as much as possible, because data is ever changing and you want to try many more off-the-shelf models.
5. Deploy new models and collect ever more feedback.

Steps 4 and 5 are basically a while loop that never ends, and that's mostly MLOps. It still requires proper ML expertise though, especially when things break.
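To make the "while loop" concrete, here is a rough Python sketch of steps 4 and 5 under toy assumptions. The "model" is just a decision threshold on a single feature, and the names `accuracy`, `retrain`, and `mlops_loop` are made up for illustration; none of them come from a real library or from any particular production setup.

```python
from typing import Iterable, List, Tuple

Example = Tuple[float, int]  # (feature value, binary label)


def accuracy(threshold: float, data: List[Example]) -> float:
    """Fraction of examples where the rule (x >= threshold) matches the label."""
    hits = sum(int(x >= threshold) == y for x, y in data)
    return hits / len(data)


def retrain(data: List[Example], candidates: List[float]) -> float:
    """Step 4 (toy version): try every candidate model, keep the best one."""
    return max(candidates, key=lambda t: accuracy(t, data))


def mlops_loop(
    batches: Iterable[List[Example]],
    candidates: List[float],
    initial: float,
) -> List[float]:
    """Steps 4 and 5 as a loop; returns the deployed model after each round."""
    deployed = initial
    data: List[Example] = []
    history: List[float] = []
    # In production this would be an endless loop fed by live traffic;
    # here it ends when the hypothetical feedback batches run out.
    for batch in batches:
        data.extend(batch)                      # step 5: collect feedback
        challenger = retrain(data, candidates)  # step 4: automated re-training
        # Only deploy when the challenger beats what is already live.
        if accuracy(challenger, data) > accuracy(deployed, data):
            deployed = challenger
        history.append(deployed)
    return history


if __name__ == "__main__":
    feedback = [
        [(0.2, 0), (0.8, 1)],
        [(0.6, 0), (0.9, 1), (0.7, 1)],
    ]
    print(mlops_loop(feedback, candidates=[0.1, 0.5, 0.65], initial=0.1))
```

The champion/challenger comparison is only one possible deployment gate, chosen here to keep the sketch short. A real pipeline would evaluate on held-out data rather than on the data the challenger was just fit to.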

We've got 4 and 5 pretty automated... the real issue (as you are likely alluding to) is when steps 1, 2, and 3 pull in new data that is completely infeasible to get at scale, and then you're expected to re-train daily, 2x a day, every 4 hours, continuously. Oh, and your costs go through the roof and likely aren't worth the returns anymore when you're chasing that last .001%.