by Frotag 167 days ago
I've tried self-hosting with Navidrome [0] / Plex / Jellyfin, but the thing I miss most is music discovery via radios / Discover Weekly. I've tried replicating it a few times with embedding vectors + vector search, but at best it finds songs in the same (sub)genre, with the tempo / mood being pretty different.

Maybe I just need better data; I've been meaning to try again when that Spotify crawl by Anna's Archive gets released. So far I've just been using MusicBrainz [1] and YouTube. Model-wise, I've tried off-the-shelf ones like [2] and [3], and training autoencoders like VAEs / MAEs [4].

[0] - https://www.navidrome.org/

[1] - https://musicbrainz.org/

[2] - https://github.com/LAION-AI/CLAP

[3] - https://github.com/SonyCSLParis/music2latent

[4] - https://arxiv.org/pdf/2207.06405
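For anyone following along, here is a minimal sketch of the "embedding vectors + vector search" idea, with a tempo filter bolted on to address the tempo mismatch mentioned above. Everything in it is an assumption for illustration: the function names `top_k_similar` and `filter_by_tempo` are hypothetical, the 2-dimensional vectors stand in for real model embeddings, and the BPM values are made up. It is not what the poster actually ran.

```python
import numpy as np


def top_k_similar(query_vec, track_vecs, k=5):
    """Return (indices, scores) of the k tracks most cosine-similar to query_vec."""
    # Normalise to unit length so a dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    t = track_vecs / np.linalg.norm(track_vecs, axis=1, keepdims=True)
    scores = t @ q
    # Sort descending by similarity and keep the first k.
    order = np.argsort(-scores)[:k]
    return order, scores[order]


def filter_by_tempo(candidates, scores, bpm, query_bpm, max_diff=10.0):
    """Drop candidates whose tempo differs from the query by more than max_diff BPM."""
    keep = np.abs(bpm[candidates] - query_bpm) <= max_diff
    return candidates[keep], scores[keep]


# Toy library: 4 tracks with made-up 2-d "embeddings" and made-up tempos.
track_vecs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.2]])
bpm = np.array([120.0, 174.0, 122.0, 124.0])

query_vec = np.array([1.0, 0.0])
idx, sims = top_k_similar(query_vec, track_vecs, k=3)
idx_tempo, sims_tempo = filter_by_tempo(idx, sims, bpm, query_bpm=120.0)
```

In this toy data, track 1 is close in embedding space but far off in tempo, so the second step drops it. The tempo values would have to come from somewhere else (metadata or a separate tempo estimator), which is its own assumption.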

3 comments

I've found the Essentia models to be pretty good for vector similarity, with much better results than CLAP: https://essentia.upf.edu/models.html#discogs-effnet

I have a service running here if you're into electronic music and want to poke around: https://cosine.club/

With the recent data leak of Spotify's entire playlist database, you might be able to build something considerably better for music discovery now!
This is the main factor that makes me leery of going the full self-host route.