

by yid 4256 days ago

While there's some truth in what you're saying, you sort of demonstrate a very common pitfall:

> Tuning parameters is basically a gridsearch. You can bruteforce this. In goes some ranges of parameters, out come the best params found.

This sounds so simple. However, if you just do a bruteforce grid search and call it a day, you're most likely going to overfit your model to the data. This is what I've seen happen when amateurs (for lack of a better word) build ML systems:

(1) You'll get tremendously good accuracies on your training dataset with grid search.

(2) Business decisions will be made based on the high accuracy numbers you're seeing (90%? wow! we've got a helluva product here!).

(3) The model will be deployed to production.

(4) Accuracies will be much lower, perhaps 5-10% lower if you're lucky, perhaps a lot more.

(5) Scramble to explain low accuracies, various heuristics put in place, ad-hoc data transforms, retrain models on new data -- all essentially groping in the dark, because now there's a fire and you can't afford the time to learn about model regularization and cross-validation techniques.
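To make steps (1) through (4) concrete, here is a rough sketch, not anything from a real system. Everything in it is an assumption chosen for illustration: the synthetic 1-D data with 20% label noise, the toy k-nearest-neighbour classifier, and the helper names (`make_data`, `knn_predict`, `cv_accuracy`). It runs the same grid search twice: once scoring each candidate on the data it was fit to, and once scoring with 5-fold cross-validation, with a separate test set held back until the end.

```python
import random

def make_data(n, rng, noise=0.2):
    """Synthetic 1-D points: label is 1 when x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data

def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(y for _, y in nearest)
    return int(2 * votes > k)

def accuracy(train, eval_set, k):
    """Fraction of eval_set labelled correctly by k-NN fit on train."""
    hits = sum(knn_predict(train, x, k) == y for x, y in eval_set)
    return hits / len(eval_set)

def cv_accuracy(data, k, folds=5):
    """Mean accuracy over `folds` slices, each scored by a model that never saw it."""
    scores = []
    for i in range(folds):
        held_out = data[i::folds]
        rest = [p for j, p in enumerate(data) if j % folds != i]
        scores.append(accuracy(rest, held_out, k))
    return sum(scores) / folds

rng = random.Random(0)
train = make_data(300, rng)
test = make_data(300, rng)    # stands in for production; used only after selection
grid = [1, 3, 5, 9, 15, 25]   # odd k avoids tied votes

# Naive grid search: score each candidate on the data it was fit to.
naive_k = max(grid, key=lambda k: accuracy(train, train, k))
# Cross-validated grid search: score each candidate on folds it did not see.
cv_k = max(grid, key=lambda k: cv_accuracy(train, k))

train_acc_naive = accuracy(train, train, naive_k)
test_acc_naive = accuracy(train, test, naive_k)
test_acc_cv = accuracy(train, test, cv_k)
print(f"naive: k={naive_k} train={train_acc_naive:.2f} test={test_acc_naive:.2f}")
print(f"cv:    k={cv_k} test={test_acc_cv:.2f}")
```

The naive search settles on k=1, because with k=1 every training point is its own nearest neighbour and the model simply memorizes the noisy labels. That gives a perfect training score, which is the number that gets shown to the business, and a visibly worse one on data the model has not seen. The cross-validated search at least picks its parameter using data each fold's model never touched, and the held-out test set gives an estimate that was not used for selection at all.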

And eventually you'll have a patchwork of spaghetti that is perhaps ML, perhaps just heuristics mashed together. So while there's value in being practical, once ML is enough of a commodity to sit in an IT stack, it is likely no longer considered ML.