by nlplaylist 1245 days ago
OP HERE! Solved the hug-of-death issue: the site had reached the Spotify 11k playlist limit. I'm deleting old playlists, and the site should be working as intended now.

THANK YOU whoever posted this!

Currently working on: Fixing the recommendation scoring function. Right now it's giving hit-or-miss responses. I think the problem is that my cross encoder "reranker" is not doing its job the right way. I'll fix the passages it looks at when re-ranking based on the query.

Also getting rid of the input box animation, lol. I've gotten flak for that on Reddit too. You should have seen the old site. It still renders that HTML on mobile.

Taking any and all questions!

5 comments

It's still hugged-to-death so I can't see it, but based on responses, this looks extremely cool!

I know that monetization is a sticky topic for many people, so please don't be offended if this isn't something you're looking for, but I have a bunch of contacts in a major music streaming service's recommendation and personalization teams - if you'd like an intro to discuss this with them (either selling the idea/implementation, or using it as a resume-item for a job), drop me an email (in my about box).

I’d love to know at a high level how you went about implementing this. Is it just using OpenAI’s built-in music knowledge? Did you do any of your own classification?
TLDR: Lots of musical metadata converted into paragraphs and the SentenceTransformers Retrieve & Re-Rank Pipeline (https://www.sbert.net/index.html).

The sentence embeddings are calculated using a Bidirectional Encoder Representations from Transformers (BERT) model. There's a publicly available pre-trained model for this, trained on over 1 billion sentences from the internet (thanks, Microsoft). The model transforms your description into a 784-long list of numbers (a vector) that represents the contextual meaning of your sentence.

The model runs off a dataset of musical metadata for 35,000 songs. As a "chronically online music nerd", I knew where to find it. The metadata is very rich: it has a lot of useful columns like the genres, subgenres, and descriptions of tracks. The numerical data is binned into categorical values, like "obscure" for popularity between 0 and 10, "highly danceable" for danceability between 80 and 100, etc. The text data is turned into coherent sentences: "this song's main genres are _____. this song is from the 80s. the name of this song is lovefool by the cardigans. etc"
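Here is a minimal sketch of that binning and sentence-building step. Only the "obscure" (0 to 10) and "highly danceable" (80 to 100) ranges come from the description above. The other bin edges and labels, the field names, and the example values are assumptions made up for illustration, not the actual NLPlaylist schema.

```python
# Bins are (lower_bound, label) pairs, checked from highest to lowest.
# Only "obscure" and "highly danceable" come from the comment above;
# every other edge and label is a hypothetical placeholder.
POPULARITY_BINS = [(60, "popular"), (11, "moderately known"), (0, "obscure")]
DANCEABILITY_BINS = [(80, "highly danceable"), (40, "somewhat danceable"),
                     (0, "not very danceable")]


def bin_value(value, bins):
    """Return the label of the highest bin whose lower bound is <= value."""
    for lower, label in bins:
        if value >= lower:
            return label
    raise ValueError(f"value {value} is below the lowest bin")


def describe_song(song):
    """Turn one metadata row (a dict with assumed keys) into a paragraph."""
    decade = f"{(song['year'] % 100) // 10 * 10:02d}s"
    sentences = [
        f"this song's main genres are {', '.join(song['genres'])}.",
        f"this song is from the {decade}.",
        f"the name of this song is {song['title']} by {song['artist']}.",
        f"this song is {bin_value(song['popularity'], POPULARITY_BINS)}.",
        f"this song is {bin_value(song['danceability'], DANCEABILITY_BINS)}.",
    ]
    return " ".join(sentences)


# Illustrative row; the numeric values are invented for the example.
example = {
    "title": "lovefool", "artist": "the cardigans", "year": 1996,
    "genres": ["pop"], "popularity": 75, "danceability": 85,
}
print(describe_song(example))
```

Binning the numbers first matters because a sentence encoder has little sense of what "danceability 85" means, while "highly danceable" is ordinary language it has seen before.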

An arduous part of the project was describing each musical genre in depth, with its own paragraph such that each genre's actual contextual meaning is captured and not just "This song is a Hyperpop song" or "This song is Adult Contemporary". It was a big exercise in music history and tested my knowledge of music. I also learned a lot about musical genres like "Mongolian Throat Singing" and how it compares to "Gamelan Throat Singing".

I also put the song lyrics for each song through GPT-3 and asked it to summarize the lyrical themes. That's also embedded and used in NLPlaylist.

Each feature for each song in our metadata dataset is now a big paragraph that describes the song. The paragraph is split up into sentences, and the embedding of each sentence is found. The final embedding for each song is then calculated by taking a weighted average over all sentence embeddings from the big paragraph and genre and lyrical embeddings.
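The weighted average can be sketched in a few lines of plain Python. The weights below are placeholders, since the comment does not say how the sentence, genre, and lyric embeddings are weighted against each other.

```python
def weighted_average(vectors, weights):
    """Combine equal-length vectors into one, weighting each by its weight."""
    if not vectors or len(vectors) != len(weights):
        raise ValueError("need one weight per vector and at least one vector")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same length")
    return [
        sum(w * v[i] for v, w in zip(vectors, weights)) / total
        for i in range(dim)
    ]


# Toy 2-dimensional "embeddings"; real sentence embeddings are much longer.
metadata_sentence = [1.0, 0.0]
lyrics_summary = [0.0, 1.0]
# Hypothetical weights: the metadata sentence counts three times as much.
song_embedding = weighted_average([metadata_sentence, lyrics_summary], [3, 1])
print(song_embedding)
```

Raising the weight on one component (say, genre) pulls every song's final vector toward that component, which is one knob for tuning what "similar" means.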

To make your playlist, all that has to be done is compare the embedding of your query to all 35,000 embeddings in the dataset and return the 100 most similar songs, using cosine similarity as the metric. Thank god we have computers.
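For the curious, the retrieval step boils down to computing cos(q, s) = (q . s) / (|q| |s|) for every song vector s and keeping the largest values. Below is a brute-force sketch in plain Python with toy 2-dimensional vectors. The function names and the dict-of-lists storage are assumptions for illustration; a real setup would more likely use a vectorized library.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_embedding, song_embeddings, k=100):
    """Return the k (score, song_id) pairs most similar to the query."""
    scored = (
        (cosine_similarity(query_embedding, emb), song_id)
        for song_id, emb in song_embeddings.items()
    )
    return heapq.nlargest(k, scored)


# Toy data standing in for the 35,000 song embeddings.
songs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
for score, song_id in top_k([1.0, 0.1], songs, k=2):
    print(song_id, round(score, 3))
```

At 35,000 songs a full scan like this is cheap, so no approximate nearest-neighbor index is needed.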

Once the 100 most similar candidate tracks are found, they are reranked using a "cross encoder" trained on 215M question-answer pairs from various sources and domains, including StackExchange, Yahoo Answers, Google & Bing search queries to give the best matches.
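The re-ranking step looks roughly like the sketch below. A cross encoder scores each (query, passage) pair jointly instead of comparing precomputed vectors, so it is slower but can be more precise, which is why it only runs on the 100 candidates. To keep the example self-contained, the scorer here is a toy word-overlap function standing in for the real model. In the SentenceTransformers pipeline that role would be played by a CrossEncoder model's predict call over the pairs. The function names and sample passages are assumptions for illustration.

```python
def word_overlap_score(query, passage):
    """Toy stand-in for a cross encoder: fraction of query words in passage."""
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & passage_words) / len(query_words)


def rerank(query, candidates, score_pair, top_n=10):
    """Re-order (song_id, passage) candidates by a pairwise relevance score."""
    scored = [
        (score_pair(query, passage), song_id)
        for song_id, passage in candidates
    ]
    # Sort by score only; the sort is stable, so ties keep retrieval order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [song_id for _, song_id in scored[:top_n]]


# Hypothetical candidates as returned by the retrieval step.
candidates = [
    ("s1", "sad slow piano ballad"),
    ("s2", "upbeat summer dance pop"),
    ("s3", "dance music"),
]
print(rerank("upbeat dance pop", candidates, word_overlap_score, top_n=2))
```

Which passage gets paired with the query matters a lot here: the scorer only sees the text it is handed, so a poorly chosen passage gives a poor ranking.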

Incredible work, and great explanation, thank you. Could you comment more on the cost of running this (e.g. per query or per hour, or however it's set up)? Where are you running the model from?

Also, could you provide more details on the cross encoder used for reranking?

Btw, if you already have all these song embeddings, it would be very interesting to be able to pick a song, and get all the similar ones in a playlist (sort of like "Song radio" on Spotify)!

Having worked with the same tech, I'd assume this is pretty inexpensive both to train and run. Probably in the low 1000s (maybe even mid 100s?) to train. kNN search on 35k entries is pretty simple. The most expensive part is probably the cross encoder (both to train and to run). Would also be interested to know
That's awesome, thank you so much for sharing all that. The project is great.
Very impressive. I'm running into an issue where if I try to tell it to exclude yacht rock, it gives me yacht rock. Is that hard to train?
I haven't tried it yet, but this is exactly how I've always wanted to have playlists generated, so I'm looking forward to trying it! Auto-generated playlists are sometimes great, but IMO they often miss the mark of what I wanted or why I like a particular song. So I think this kind of playlist generation could help solve this, especially if the playlist can be refined after initial generation.
Make this an Alexa skill
It's still dead.