Hacker News
by sevensor 2509 days ago
I recently got the assignment to "do ML" on some data. I hadn't done anything in the area before, and a couple of things surprised me:

1. Most of your time is spent transforming data. Very little is spent building models.

2. Most of the eye-grabbing stuff that makes headlines is inapplicable. My application involves decisions that are expensive and can be safety critical. The models themselves have to be simple enough to be reasoned about, or they're no use.

You might argue that this means what I'm actually doing is statistics.

2 comments

The longer you work with ML, the more you discover that it's almost exclusively about handling data.

It's also one critique I have of academia. When learning ML in academia, nine times out of ten you work with clean, neat toy datasets.

Then you go out in the "real world" and instantly get hit with reality: You're gonna spend 80% of the time fixing data.

With that said, I think that 10 years from now, ML is going to be almost exclusively SaaS with very high levels of abstraction and very little coding for the average user. Maybe some light scripting here and there, but mostly just drag-and-drop stuff.

> You might argue that this means what I'm actually doing is statistics.

What's the difference?

ML conferences have way bigger budgets.