| Honestly? The exact problem I'm dealing with at work right now. We're trying to rewrite our recommender for artist music stations at iHeartRadio (aka "I'll listen to Drake or Kendrick Lamar's station at the gym today"). Just today, I tried adding negative sampling to the matrix I'm factorizing, hoping it encourages spread in the embeddings learned for artists in certain types of genres. I have an MS, but not a lot of research experience. It would have taken me a while to find this solution on my own. However, the moment I described this problem to my manager - a PhD graduate with several years of research and industry experience - he immediately suggested negative sampling. What I learned during my MS helped me grok the math immediately. We're adding noise to the training set and penalizing vector lengths to avoid overfitting. Easy! Identifying a solution worth exploring? Not easy, at least without a degree or significant experience. (There's also the chance I should know this, in which case I have some reading to do. ¯\_(ツ)_/¯) |
I am a good way through my master's (second CS degree, first specializing in ML), and the more I learn, the more I realize that on any given topic, there is no guarantee the PhD in the room has the most expertise. Machine learning is a broad field with many subfields, methodologies, and applications. It is a bit like computer systems or software engineering: nobody knows it all, and the experts have intimate knowledge of a specific subset of the field. Of course, you can move around over time, but it takes years to build up expertise in even two or three subfields of machine learning.
Side note: sounds like we do similar work. I work at Vevo and also do a lot of matrix factorization to learn latent factors for items such as artists, videos, etc.
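
For anyone following along, here is a rough sketch of what "negative sampling plus penalizing vector lengths" can look like for implicit-feedback matrix factorization. This is one common formulation (logistic loss, uniformly sampled negatives, L2 regularization), not what iHeartRadio or Vevo actually runs. The function name `train_mf_negative_sampling`, the hyperparameters, and the toy data are all made up for illustration.

```python
import numpy as np

def train_mf_negative_sampling(positives, n_users, n_items, dim=8, n_neg=2,
                               lr=0.05, reg=0.01, epochs=20, seed=0):
    """Factorize implicit feedback with SGD, sampling random negatives.

    positives: list of (user, item) pairs that were observed (e.g. listens).
    Returns user factors U (n_users x dim) and item factors V (n_items x dim).
    """
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(n_users, dim))
    V = rng.normal(scale=0.1, size=(n_items, dim))
    seen = set(positives)

    def sgd_step(u, i, label):
        # Logistic loss on the dot product, plus an L2 penalty (reg)
        # that shrinks the vector lengths to limit overfitting.
        pred = 1.0 / (1.0 + np.exp(-(U[u] @ V[i])))
        err = label - pred
        u_old = U[u].copy()
        U[u] += lr * (err * V[i] - reg * U[u])
        V[i] += lr * (err * u_old - reg * V[i])

    for _epoch in range(epochs):
        for idx in rng.permutation(len(positives)):
            u, i = positives[idx]
            sgd_step(u, i, 1.0)              # observed pair: push score up
            for _k in range(n_neg):
                j = int(rng.integers(n_items))
                if (u, j) not in seen:       # skip accidental positives
                    sgd_step(u, j, 0.0)      # sampled negative: push score down
    return U, V

# Toy data (hypothetical): users 0-1 like items 0-1, users 2-3 like items 2-3.
positives = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2), (2, 3), (3, 2), (3, 3)]
U, V = train_mf_negative_sampling(positives, n_users=4, n_items=4)
scores = U @ V.T  # higher score = stronger predicted affinity
```

Uniform sampling is the simplest choice. Popularity-weighted sampling is a common alternative, and which one works better probably depends on the data.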