by raffraffraff 5 days ago
I'm working on a recommendation service (which, to me, is the piece I'm missing when I play my local mp3 collection).

I collect song metadata from various places (genre, instruments, track credits, rating). I also scrape charts by year, genre etc.

Then I run an ETL job on the JSON data I've downloaded, pre-building queries into extremely fast lookup tables. This gets saved to DuckDB, which is used by my Go web UI/API.
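
A minimal sketch of what that kind of ETL step could look like, under stated assumptions: the `Song` record shape, the field names, and the `BuildGenreIndex` helper are all hypothetical, and an in-memory map stands in for the DuckDB table so the example needs only the Go standard library. It shows the general idea of doing the grouping once, ahead of time, so that a lookup at request time is a single key access.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// Song is a hypothetical shape for one downloaded metadata record.
// The real JSON layout is not described in the post.
type Song struct {
	Title  string   `json:"title"`
	Artist string   `json:"artist"`
	Genres []string `json:"genres"`
}

// BuildGenreIndex decodes a JSON array of songs and pre-computes a
// genre -> sorted "artist - title" lookup table. In the setup described
// above this table would be written to DuckDB; a map stands in for it here.
func BuildGenreIndex(raw []byte) (map[string][]string, error) {
	var songs []Song
	if err := json.Unmarshal(raw, &songs); err != nil {
		return nil, fmt.Errorf("decode songs: %w", err)
	}
	index := make(map[string][]string)
	for _, s := range songs {
		label := s.Artist + " - " + s.Title
		for _, g := range s.Genres {
			// Normalise the key so "Jazz" and "jazz" land in the same bucket.
			key := strings.ToLower(strings.TrimSpace(g))
			if key == "" {
				continue
			}
			index[key] = append(index[key], label)
		}
	}
	// Sort each bucket once at build time so reads need no further work.
	for key := range index {
		sort.Strings(index[key])
	}
	return index, nil
}

func main() {
	// Placeholder records, not real chart data.
	raw := []byte(`[
		{"title": "Song One", "artist": "Artist A", "genres": ["Jazz"]},
		{"title": "Song Two", "artist": "Artist B", "genres": ["jazz", "Rock"]}
	]`)
	index, err := BuildGenreIndex(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(index["jazz"])
}
```

The same pattern extends to other keys mentioned above (year, instrument, credits): one pre-built table per question the UI needs to answer.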

It's very early days, and I only spend one or two hours a week on it, but right now it's amazingly useful. It has metadata for roughly 80k songs. To preview the suggested songs I ended up building a very cut-down YouTube music player, except that the playing song has all the metadata right there, and everything is a link that can take you to the artist, composer, instrument, genre, album etc. It's a great way to "wander through your collection".

Unfortunately this is only useful to me, because I targeted the music I listen to.

Next step is to download lyrics and extract song meaning, keywords etc. Then use MusiCNN (or CLAP, OpenL3, HTSAT) to extract embeddings. Finally, train my own model for nearest-neighbor retrieval based on a mix of metadata, giving the user the ability to tune it on the fly.
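
To make the "tune it on the fly" idea concrete, here is a rough sketch under stated assumptions. It is not a trained model: it is a brute-force weighted score that mixes genre overlap (Jaccard) with embedding similarity (cosine), where the weights are the knobs a user could adjust. The `Track`, `Weights`, and `Nearest` names are hypothetical, and the embedding values are placeholders rather than real MusiCNN or CLAP output.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Track is a hypothetical record combining metadata and an audio embedding.
type Track struct {
	Name      string
	Genres    []string
	Embedding []float64
}

// Weights are the user-tunable knobs: how much each signal counts.
type Weights struct {
	Genre float64
	Audio float64
}

// jaccard returns |A ∩ B| / |A ∪ B| for two genre lists.
func jaccard(a, b []string) float64 {
	set := make(map[string]bool, len(a))
	for _, x := range a {
		set[x] = true
	}
	inter, union := 0, len(set)
	seen := make(map[string]bool, len(b))
	for _, y := range b {
		if seen[y] {
			continue
		}
		seen[y] = true
		if set[y] {
			inter++
		} else {
			union++
		}
	}
	if union == 0 {
		return 0
	}
	return float64(inter) / float64(union)
}

// cosine returns the cosine similarity, or 0 for mismatched or zero vectors.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// Nearest scores every track in pool against query and returns the top k.
func Nearest(query Track, pool []Track, w Weights, k int) []Track {
	type scored struct {
		track Track
		score float64
	}
	ranked := make([]scored, 0, len(pool))
	for _, t := range pool {
		if t.Name == query.Name {
			continue // do not recommend the song that is already playing
		}
		s := w.Genre*jaccard(query.Genres, t.Genres) +
			w.Audio*cosine(query.Embedding, t.Embedding)
		ranked = append(ranked, scored{t, s})
	}
	sort.SliceStable(ranked, func(i, j int) bool { return ranked[i].score > ranked[j].score })
	if k < 0 {
		k = 0
	}
	if k > len(ranked) {
		k = len(ranked)
	}
	out := make([]Track, 0, k)
	for _, r := range ranked[:k] {
		out = append(out, r.track)
	}
	return out
}

func main() {
	query := Track{Name: "playing", Genres: []string{"jazz"}, Embedding: []float64{1, 0}}
	pool := []Track{
		{Name: "same-genre", Genres: []string{"jazz"}, Embedding: []float64{0, 1}},
		{Name: "same-sound", Genres: []string{"rock"}, Embedding: []float64{1, 0}},
	}
	// Shifting the weights changes which track comes out on top.
	fmt.Println(Nearest(query, pool, Weights{Genre: 1, Audio: 0}, 1)[0].Name)
	fmt.Println(Nearest(query, pool, Weights{Genre: 0, Audio: 1}, 1)[0].Name)
}
```

A linear scan like this is fine for a collection in the tens of thousands; a learned model or an approximate index would replace the scoring loop, not the idea of exposing the weights.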

1 comment

Did you ever have to pass the App Store review process? How do they look at copyright and stuff when you are publishing an app that plays your local mp3 collection (how does your mp3 collection end up on your phone?)
Right now mine isn't an app (yet); it's the backend service (API and web), so I just run it in a browser.

For now I'm letting that backend access my files directly. The front end can also play YouTube music (free, using the yt-dlp method).

None of this is public yet, so right now I don't need anyone's approval.