by jrjames83
3081 days ago
I would recommend working on as many Kaggle competitions as you can. Try your best to learn to manipulate data quickly and efficiently (use Python). Choosing the correct algorithm is typically trivial, a solved problem for most typical business issues (classification of text, imagery or audio data). At first you'll have to look at other people's code and copy/paste/edit/learn/iterate.

Also, don't underestimate the complexity of creating good training, validation and potentially test sets. 95% of your time will be spent massaging data. People say it so often that it may sound ridiculous, but I promise you, it's not. This skill is also readily transferable to other domains.

I would start with classifying text using sklearn and Facebook's fastText. Then try the dogs/cats image classification challenge and get familiar with Keras and its utilities.

I created a few hours of content around recognizing Bill Gates or Jeff Bezos, then trying to recognize 2 dog breeds. I outline the challenges of creating "good" training and validation sets. https://www.youtube.com/watch?v=O3hffX-jC98&list=PLImyDqSBQb...

For every hour you spend looking at equations, you miss an hour you could have spent getting better at manipulating data and getting it into a format that can be fed into well understood and maintained algorithms. Once you feel like you can produce results, go back and think about the underlying mechanics.

My 2 cents from experience.
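To make the point about training/validation/test sets a bit more concrete, here is a rough sketch of a stratified split in plain Python. It is only an illustration, not code from the videos: the function name `stratified_split`, the fractions, and the toy cat/dog labels are all made up for the example. The idea is that each class is shuffled and split separately, so a rare class does not vanish from the validation or test set by chance.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_frac=0.2, test_frac=0.1, seed=0):
    """Return (train, val, test) index lists, splitting each class separately.

    Hypothetical helper for illustration; fractions and seed are assumptions.
    """
    rng = random.Random(seed)

    # Group example indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)  # shuffle within the class only
        n_val = round(len(idxs) * val_frac)
        n_test = round(len(idxs) * test_frac)
        val.extend(idxs[:n_val])
        test.extend(idxs[n_val:n_val + n_test])
        train.extend(idxs[n_val + n_test:])
    return train, val, test


# Toy, made-up labels: 80 "cat" and 20 "dog" examples (imbalanced on purpose).
labels = ["cat"] * 80 + ["dog"] * 20
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))
```

In practice you would probably reach for sklearn's `train_test_split` with its `stratify` argument rather than rolling your own, but writing it once by hand shows what the split is protecting you from. It does not handle the harder cases, such as near-duplicate images or the same person appearing in both train and validation.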