by rm999
4083 days ago
Meh. The more I do machine learning in industry, the more I realize how little the ML part matters compared to everything else. A typical project I've seen takes 3-6 months and contains thousands of lines of code, but the machine learning part will take a week or two and be 100 lines of code. What Amazon ML is doing would probably take an hour and 30 lines of R code you can easily find online. And here's the not-too-hidden secret: the ML part is the fun part. It's a big reason we spend months creating banking.csv. Josh Willis did a very funny presentation at MLconf partly about this. It's like waiting in line at a theme park for an hour, and then paying someone to cut in line at the last minute and record the ride for you.
https://www.youtube.com/watch?v=4Gwf5zsg4vI&feature=youtu.be...

All these scenarios are difficult to debug because it's "statistical debugging". There are no breakpoints to set or watch windows to look at. There is no stack trace and there are no exceptions. Any Joe can train a model given training data, but it takes a fair bit of genius to debug these issues and push model performance to the next level. Unfortunately, all these new and old "frameworks" almost completely ignore the debugging part. I think the first framework that has great debugging tools will revolutionize ML like Borland revolutionized programming with its visual IDEs.