How to deliver on Machine Learning projects | HN Mirror

	How to deliver on Machine Learning projects (blog.insightdatascience.com)
	162 points by jakek 2819 days ago

7 comments

vjsc 2819 days ago

So we had this idea of a new feature for our product. The only way to quickly do it was to somehow implement a machine learning algo and that would give us the result that we wanted. Viola!! It seemed simple.

Now our company doesn't have any machine learning expert or a data science genius. Hiring one would take time. Taking someone on under contract would be very expensive (our CEO wasn't ready to shell out that kinda money). So the task fell on me. They asked me to go through the multitudes of machine learning MOOCs out there and get a working prototype ready in 2 weeks.

I had already done Andrew Ng's course back when it came out for the first time. But my memory had faded from lack of practice.

I went through the course again. I went over a couple of online ML books too.

Then I started thinking of the problem at hand. Unfortunately, it turned out to be a chicken and egg problem. For the feature to work perfectly we needed a large amount of training data to train our models. But without the feature actually deployed, we didn't have any way to collect any training data.

So we ultimately fell back to a simple algo that made its decisions based on a few hard-coded rules. Things have been working fine so far.

hellogoodbyeeee 2819 days ago

They gave you two weeks to become a data scientist and implement a working solution? That's nuts. I'm still pretty early career, but I have done data science work for about four years now and I would've quoted at least two months to figure out the data, clean it, engineer features, run models, compare results, and then deliver the best performing solution.

tedivm 2819 days ago

And they didn't even have data!

pletnes 2819 days ago

No data cleaning required. That’s often 80% of a project. So 2 months -> 2 weeks makes sense now!

JHonaker 2819 days ago

No data is better than 10 years of useless data. I’d much rather be in the position of designing the data collection (experimental design ftw) than trying to fix the problems with an overly complicated modeling project. Buuut, I am a statistician.

In my experience, having someone who knows what they’re doing on the front end of a study, design-wise, can save weeks or months of work on the back end of a study or project.

speby 2818 days ago

> They gave you two weeks to become a data scientist and implement a working solution? That's nuts.

Oh c'mon. At any large company today, the expectation or deadline for practically anything is "asap" or measured in a few weeks at most. Short-term thinking is a major force in publicly traded companies, and that is what opens the door for startups to play the long game.

ellisv 2819 days ago

> Unfortunately, it turned out to be a chicken and egg problem. For the feature to work perfectly we needed a large amount of training data to train our models. But without the feature actually deployed, we didn't have any way to collect any training data.

Everyone outside of data science seems really surprised by this and I can't count the number of times someone has asked me to build an algorithm for X but has none of the data to support doing so. It doesn't mean the feature/product can't be built but they often want a supervised learning solution without the cost (and time) of acquiring the ground truth data.

superflyguy 2819 days ago

"The only way to quickly do it was to somehow implement a machine learning algo and that would give us the result that we wanted. Viola!!"

Designing the perfect viola using machine learning doesn't sound like it's something for beginners.

MasterScrat 2819 days ago

Yes, sorry to be "that guy" as well, but it's voila ("voilà" if you want to be pedantic).

"Viola" either refers to a stringed instrument, or means "raped" in the sense "he raped" ("il viola"). So please don't use it as an interjection.

S4M 2819 days ago

I think everybody understood that "viola" was a typo.

zwieback 2819 days ago

Just take a violin and scale by a factor of 1.2 or so.

idontpost 2819 days ago

I feel like the devil is in the "or so" part.

dragandj 2819 days ago

> The only way to quickly do it was to somehow implement a machine learning algo and that would give us the result that we wanted.

Since none of you had any experience with ML, how did you know that an ML algo (which one?), implemented "somehow", would give you the results you wanted? (Not a cynical comment; I am really interested in hearing about this).

jimcsharp 2819 days ago

Not OP, but went through a similar situation and the feature was 'alert us about unintuitive correlations in our data so we can invent new KPIs'

e_ameisen 2819 days ago

Co-author here. This is a surprisingly common situation. In fact starting with the simplest algo is usually the best way to prove the validity of your approach, and gather initial data to build a more complex model later.

In addition, getting the feature to “work perfectly” from the get-go is usually quite hard, even with lots of data.

probably_wrong 2819 days ago

Maybe it's an instance of "when all you have is a hammer...", because I'm learning about it right now, but you could look into transfer learning: you train an ML model on a similar, easier task, and then you tweak it with your data.

That said, there's a good chance that your current algorithm is all you will ever need. Many times an ML project is too much, and you already have good results.

minimaxir 2819 days ago

Transfer learning only works if the original model is in the same domain (e.g. ImageNet for images, GloVe for text). A bespoke problem likely won't have a widely-available original model.

sidr 2819 days ago

I like this resource a lot for new data scientists: https://developers.google.com/machine-learning/guides/rules-... . Rule #1 seems pertinent to your situation.

bonniemuffin 2819 days ago

That seems fine to me. It's a good practice to start with hard-coded business rules instead of any kind of model, just to test the waters, collect some data, and see if a new feature even makes sense, before diving into building even the simplest model.
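
As a rough illustration of the "hard-coded rules first, collect data as you go" approach described above, here is a minimal Python sketch. The rule names, the event fields (`amount`, `account_age_days`) and the thresholds are all hypothetical and not taken from anyone's actual product; the point is only that every decision gets logged with its inputs, so labels can be attached later and used to train a model if one turns out to be needed.

```python
import json
import time

# Hypothetical hard-coded rules. Each returns a decision, or None to abstain.
def rule_large_order(event):
    return "flag" if event.get("amount", 0) > 1000 else None

def rule_new_account(event):
    return "flag" if event.get("account_age_days", 9999) < 2 else None

RULES = [rule_large_order, rule_new_account]

def decide(event, rules=RULES, default="allow"):
    """Return (decision, rule_name) from the first rule that does not abstain."""
    for rule in rules:
        decision = rule(event)
        if decision is not None:
            return decision, rule.__name__
    return default, "default"

def decide_and_log(event, log):
    """Decide, then append a JSON record that can become a training row later."""
    decision, source = decide(event)
    log.append(json.dumps({
        "ts": time.time(),
        "features": event,
        "decision": decision,
        "rule": source,
        "label": None,  # ground truth, filled in once the outcome is known
    }))
    return decision
```

Here `log` is just a list; in practice it would presumably be a file or a table, which is an assumption of this sketch rather than something the thread specifies.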

chippy 2819 days ago

I've been talking to academic neural net / ML experts in computer vision and OCR / NLP and the thing they try to stress is that for almost all cases an algorithmic approach works better.

yonkshi 2819 days ago

I don't think most ML experts would agree with that. A big reason DL became popular is the huge improvements it brought to the CV and NLP fields.

In many ways, traditional approaches were harder because you need a huge amount of domain expertise in CV & NLP, whereas an ML expert can solve simple CV problems with almost no domain knowledge.

Now, a lot of the business data, especially time series data, I agree that an algorithm/heuristic approach is easier and more robust. E.g. recommendation systems.

pedrosorio 2819 days ago

"traditional approaches" in CV & NLP were also ML (a quick reminder that machine learning existed long before the deep learning hype).

Not sure what the parent meant by "algorithmic approach" though.

claytonjy 2819 days ago

Yes, but before the ML step the old approaches relied on expert-crafted features. The breakthroughs in those fields via deep learning came because people found architectures (CNNs/RNNs) that could learn those features much, much, much more efficiently than they could be hand-crafted.

sgt101 2818 days ago

Well, if you tune the hyperparameters and architecture just so....

aaronblohowiak 2819 days ago

While not ML it still would have been considered a form of AI back in the day — “expert systems” they used to call it :)

fromthestart 2819 days ago

Machine Learning is much more nuanced than people seem to understand. You can't just throw data at a net and expect results-this field requires a heavy degree of intuition, and engineers must be prepared for nets to pick up on patterns not obvious to humans, which can lead to unintuitive results.

Neural nets are basically black box heuristics, with unpredictable edge cases. Much like human reasoning, I'd warrant!

b_tterc_p 2819 days ago

This doesn’t seem to offer any novel perspectives. I read it as intended for self-marketing.

e_ameisen 2819 days ago

Co-author here. This post came out of a discussion with Adam, where we both realized that the advice we were giving to ML teams and ML Engineers to guide them to better results was very often process-centric rather than model-centric.

Many resources exist online about how to get a model to converge, and that’s not usually what makes or breaks a project.

Data acquisition, augmentation, model selection, and iterative exploration, however, seem quite rarely discussed compared to how important we have seen them be. This is our attempt at sharing this outside of our usual circles.

bigmanwalter 2819 days ago

Platitudes on platitudes. "Be efficient." "Do it right."

ParanoidShroom 2819 days ago

Isn't that the goal of every company's Medium blog :')

That doesn't mean it's bad.

DrNuke 2819 days ago

Novelty per se means nothing; for business, the more standardized the better. In ML/DL for b2b we badly need unified best practices and, above all, sensitivity (ablation) protocols to demonstrate our models are neither overfitting nor cherry-picking.
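
For readers unfamiliar with the term, a leave-one-out ablation can be sketched in a few lines of Python. Everything here is illustrative: `ablation_report` and `toy_evaluate` are hypothetical names, and the toy evaluator with fixed weights merely stands in for "retrain without this feature and score on held-out data". It is not a proposal for the unified protocol the comment asks for.

```python
def ablation_report(features, evaluate):
    """Score drop when each feature is removed, relative to the full set."""
    baseline = evaluate(list(features))
    report = {}
    for name in features:
        kept = [f for f in features if f != name]
        report[name] = baseline - evaluate(kept)
    return report

# Hypothetical stand-in for a real retrain-and-score step.
TOY_WEIGHTS = {"price": 0.30, "region": 0.05, "user_id_hash": 0.0}

def toy_evaluate(kept):
    return 0.5 + sum(TOY_WEIGHTS[f] for f in kept)

report = ablation_report(list(TOY_WEIGHTS), toy_evaluate)
most_important = max(report, key=report.get)
```

A feature whose removal changes nothing (here `user_id_hash`) contributes nothing the model depends on, while a large drop shows where the score really comes from.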

AlexanderNull 2819 days ago

I hate quotes but there's a single one I'll ever use because it's not only accurate but incredibly useful: "People need to be reminded more often than they need to be instructed."

seren 2819 days ago

That sounds awfully close to DMAIC.

https://en.wikipedia.org/wiki/DMAIC

Nothing wrong with that though...

sgt101 2819 days ago

So we do the loop 50 times and we now have an algorithm that works (97%!) on the test set. We are happy! We run it in production and everything looks good (probably 92%-ish). Everyone is happy! We all get promoted or get new jobs. Then, one day, someone actually looks at what it's doing... and lo. It. does. not. work (~51%). Everyone is sad. Apart from us! Yay!

Seriously - an optimisation loop on a test set? Seriously?
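
One common way to avoid the failure mode described above is to tune against a separate validation split and touch the test split only once at the end. The Python sketch below assumes toy data and hypothetical threshold "models"; the function names are made up for illustration and the 60/20/20 proportions are an arbitrary choice.

```python
import random

def three_way_split(items, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle once, then cut into train / validation / test partitions."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

def accuracy(predict, rows):
    return sum(predict(x) == y for x, y in rows) / len(rows)

def select_model(candidates, val):
    """Pick the candidate with the best validation accuracy."""
    return max(candidates, key=lambda predict: accuracy(predict, val))

# Toy data: the label is 1 when x >= 50.
data = [(x, int(x >= 50)) for x in range(100)]
train, val, test = three_way_split(data)

# Hypothetical candidates: threshold classifiers at 0, 10, ..., 90.
candidates = [lambda x, t=t: int(x >= t) for t in range(0, 100, 10)]

best = select_model(candidates, val)   # the iteration loop only ever sees val
final_score = accuracy(best, test)     # test is scored once, at the very end
```

The train split goes unused here because the toy candidates need no fitting. Even with this discipline, the held-out score only estimates production performance if the production data resembles the data that was split.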

rfeather 2819 days ago

The point about hacking away at the code needs to be couched heavily. It's too easy to conclude you've got negative or positive results when what you really have is a silly little bug. The lack of focus on implementation skills in data (or even "real") science is frightful. The one takeaway anyone trained in software engineering could share is that if you aren't very sure if it is working as intended, it's very likely not. Code review is very applicable here when making major pivots, even if unit or other testing is decidedly too time-consuming for the train-test-improve loop.

Edit: typo "of" to "if". Somewhat serendipitous if you think about it.

reureu 2819 days ago

I love that "Data Scientist" has become such an inflated and meaningless title that now we have "Machine Learning Engineer".

ende 2819 days ago

Well, “Data Scientist” has been appropriated by the overflow of PhDs w/o any actual stats or computational backgrounds and few academia prospects, so I guess you need to create new job titles for those who are going to do the actual work.

reureu 2819 days ago

I totally agree, and wasn't arguing that a new title wasn't necessary. And I'm ok with my downvotes for that comment :)

It's just funny that "Data Scientist" seemed to be originally branded as the more technical/engineer-y version of a data analyst. Now I get recruiters contacting me for "Data Scientist" positions that entirely revolve around SQL and Excel, and nobody in the Bay Area hires "Data Analysts" anymore.

Alright, guess it's time to update my LinkedIn and resume to adjust for this inflation? Maybe I should jump up a few inflation levels and just become a "Deep Learning Engineer."

borroka 2818 days ago

I do not see any problem with that. There is a ton of confusion in the tech world regarding labels, who does what, and whether it is needed or not, outside of the core actions that need to be done. Laying off 50% of tech people from public tech companies might even be a net positive for the companies. Not for a tech worker like me, so please do not tell them.

Taking advantage as much as possible of hypes and other people's laziness is fine in my book. It is certainly not my duty from the outside to educate recruiters and business people who make hiring decisions on the field. When I tried, from the inside, to gently point out that what they were thinking did not make any sense, I just put myself in a dangerous spot. I can be a data scientist, deep learning engineer, machine learning engineer, machine learning research scientist, whatever pays more and whoever has the most fun. If using an RNN instead of a more effective and efficient linear regression gives me more money and prestige, I will do it. As an IC you either go with the flow or you are not having a good time. The vast majority of us are not saving lives anyway.