by macrolime 1292 days ago
No. I released part of it as open source, but that was just a script to import Discogs into an SQL database. It was never really more than a proof of concept, and I think it would be quite difficult to get the code running now. At the time, one of the big issues I had was that I had no idea how to find the nearest neighbour of a vector without the search being really slow.

I was also experimenting with finding a popularity score for songs and artists. The last.fm API worked well for this, but that was just using an API, so there were a lot of other sources I was looking into, such as pageviews on an artist's Wikipedia article.

I've thought about making a new open source version at some point if I get time. I think I've got a decent idea of how to make something work quite well now: basically, make a vector from the genres/styles, do dimensionality reduction, and then store it in a vector database, so you essentially get an embedding of the album. It's a bit like how language models embed words in a vector space, but you don't need a neural network to do it, since that job has already been done by humans who have listened to the music and tagged it.
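
To make the idea concrete, here is a rough sketch of what such a pipeline might look like. Everything in it is an assumption for illustration: the album names and tags are made-up toy data (not real Discogs records), truncated SVD stands in for "dimensionality reduction", and a brute-force cosine-similarity scan over a NumPy array stands in for the vector database.

```python
import numpy as np

# Hypothetical toy data: album -> human-assigned genre/style tags.
albums = {
    "Album A": ["techno", "minimal"],
    "Album B": ["techno", "acid"],
    "Album C": ["folk", "acoustic"],
    "Album D": ["folk", "acoustic", "country"],
    "Album E": ["minimal", "ambient"],
}

names = list(albums)
tags = sorted({t for ts in albums.values() for t in ts})
tag_index = {t: i for i, t in enumerate(tags)}

# Step 1: one multi-hot row per album (1.0 where the album has the tag).
X = np.zeros((len(names), len(tags)))
for row, name in enumerate(names):
    for t in albums[name]:
        X[row, tag_index[t]] = 1.0

# Step 2: dimensionality reduction via truncated SVD (assumed choice),
# keeping only the k strongest components.
k = 2
U, S, Vt = np.linalg.svd(X, full_matrices=False)
emb = U[:, :k] * S[:k]

# Normalise rows so a dot product equals cosine similarity.
norms = np.linalg.norm(emb, axis=1, keepdims=True)
emb = emb / np.maximum(norms, 1e-12)

# Step 3: brute-force nearest-neighbour lookup, standing in for the
# vector database.
def nearest(name, n=2):
    query = emb[names.index(name)]
    scores = emb @ query
    order = np.argsort(-scores)
    return [names[i] for i in order if names[i] != name][:n]
```

With this toy data, `nearest("Album C", 1)` picks out the other folk album. A brute-force scan like this is fine for a few thousand albums; for something the size of Discogs you would presumably swap it for an approximate nearest-neighbour index, which is the part that addresses the slow lookups mentioned above.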

1 comment

The Echo Nest published a song metadata collection 11 years ago: http://millionsongdataset.com/

They were acquired by Spotify in 2014 and, according to announcements at the time, the intention was to enhance Spotify's recommendation/discovery system with the deeper insight into songs that The Echo Nest brought (Infinite Gangnam Style, anyone?), but I don't know what came out of it.