Hacker News
by Arubis 15 days ago
That's because the recommendation engine that Last.fm used back in the day was made the incredibly expensive way: the entire corpus was hand-tagged and cross-linked by humans atop an enormous CDDB. Last.fm, Audioscrobbler, and MusicBrainz (the association engine) were all linked together.
3 comments

The recommendations engine used them, but its main strength was that it was primarily based on collaborative filtering (https://en.wikipedia.org/wiki/Last.fm).

Essentially if people who listen to many of the same artists/tracks as I do have discovered other things I have not, then those unseen artists/tracks become candidate recommendations.
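
For anyone who wants to see that idea concretely, here is a minimal sketch of user-based collaborative filtering. It is not Last.fm's actual code: the `recommend` function, the `listens` data, and the choice to weight each user by the raw count of shared artists are all assumptions made for illustration.

```python
from collections import Counter

def recommend(target, listens, top_n=3):
    """Rank artists the target has not heard by how much like-minded users play them."""
    mine = listens[target]
    scores = Counter()
    for user, theirs in listens.items():
        if user == target:
            continue
        overlap = len(mine & theirs)  # number of shared artists, used as a crude similarity weight
        if overlap == 0:
            continue  # users with nothing in common contribute nothing
        for artist in theirs - mine:  # only artists the target has not listened to
            scores[artist] += overlap
    return [artist for artist, _ in scores.most_common(top_n)]

# Hypothetical listening data: user -> set of artists they have scrobbled.
listens = {
    "me": {"Primus", "Tool", "Ween"},
    "alice": {"Primus", "Tool", "Mr. Bungle"},
    "bob": {"Primus", "Coldplay"},
    "carol": {"Coldplay", "Keane"},
}
print(recommend("me", listens))  # ['Mr. Bungle', 'Coldplay']
```

Alice shares two artists with "me", so her unseen artist outranks Bob's, and Carol shares nothing so Keane never shows up. A real system would presumably use play counts and a normalized similarity rather than a raw overlap count.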

It worked as well as it did because they had a user base of music fans with a wide variety of tastes. CBS ran them into trouble when they upset those fans by breaking the radio and by being perceived as too close to the RIAA.

They will need to get the numbers up, but I'm hoping them being independent again is a good sign.

> They will need to get the numbers up, but I'm hoping them being independent again is a good sign.

The problem will be recovering from algorithmic poisoning by folks who just scrobble from Spotify.

Just filter out Spotify entries. Scrobbles are tagged with the source.
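
A rough sketch of what that filter could look like, assuming each scrobble is a dict with a `source` key. That record shape and the `without_source` helper are hypothetical, not the real Last.fm data model.

```python
def without_source(scrobbles, blocked="spotify"):
    """Drop scrobbles whose (hypothetical) source tag matches `blocked`, ignoring case."""
    return [s for s in scrobbles if s.get("source", "").lower() != blocked.lower()]

# Hypothetical scrobble records; the "source" field name is an assumption.
scrobbles = [
    {"artist": "Primus", "track": "Tommy the Cat", "source": "Spotify"},
    {"artist": "Ween", "track": "Ocean Man", "source": "foobar2000"},
    {"artist": "Tool", "track": "Schism"},  # untagged scrobbles are kept
]
kept = without_source(scrobbles)
print([s["artist"] for s in kept])  # ['Ween', 'Tool']
```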
Is that even a problem? If someone consumes a lot of algorithmic recommendations and you don't, wouldn't that drift you farther apart in the last.fm relationship?
If you really like Song X and Song X happens to be on a popular spotify playlist with a bunch of stuff you’re not into, you’ll start getting recommended all that other stuff on last.fm, no?
Well, the current last.fm "play your recommendations" is linked to spotify, so maybe you're right? Last.fm has gone through phases of no streaming, streaming, and partner streaming, and TBH I haven't used last.fm as a stream source in quite a while. I guess it seems possible that if they outsource their recommendation playback to spotify, you'll get spotify recommendations.

Outside of the Spotify integration, last.fm doesn't have visibility into anything that isn't scrobbled AFAIK. It's based on user data only. You have "neighbors" who have similar tastes, which I think is calculated based on overlapping scrobbles (not sure if time-weighted, or just top listens). If we both start scrobbling with a limited number of artists, and 75% of our scrobbles are the band Primus, we're probably going to be neighbors. If I decide that Primus sucks and start listening to Coldplay all day, our Venn diagram overlap shrinks and we're not neighbors anymore.
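
To put numbers on the Primus example, here is a small sketch that scores two listeners by the overlap of their play-count shares (a weighted Jaccard similarity). How last.fm really computes neighbors isn't known here, so the metric, the `taste_overlap` name, and the play counts are all assumptions.

```python
def taste_overlap(a, b):
    """Weighted Jaccard similarity of two play-count dicts: 0 means disjoint, 1 means identical shares."""
    if not a or not b:
        return 0.0
    total_a, total_b = sum(a.values()), sum(b.values())
    artists = set(a) | set(b)
    # Compare each artist's share of total plays, so heavy and light listeners are comparable.
    shared = sum(min(a.get(x, 0) / total_a, b.get(x, 0) / total_b) for x in artists)
    union = sum(max(a.get(x, 0) / total_a, b.get(x, 0) / total_b) for x in artists)
    return shared / union

# Hypothetical play counts.
you = {"Primus": 75, "Tool": 25}
me_before = {"Primus": 75, "Ween": 25}
me_after = {"Coldplay": 90, "Primus": 10}

before = taste_overlap(you, me_before)  # about 0.6: both mostly Primus
after = taste_overlap(you, me_after)    # about 0.05: the overlap has mostly gone
print(round(before, 2), round(after, 2))
```

With a neighbor cutoff anywhere between those two scores, the switch to Coldplay is enough to drop the pair out of each other's neighbor lists.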

Maybe the neighbors influence the recommendations, but playback is outsourced to spotify? I guess I don't really know. You can still browse neighbors though, and use their top lists as "recommendations", which should only be based on listening history.

I think pandora had the best recommendations - because it was based on real human input, not AI and not even "group think".

Sadly, many years ago Pandora became US-only, so I couldn't use it (did not bother with a VPN).

Yeah. 100% agree here. The Pandora algorithm was an absolute breath of fresh air relative to every other recommendation system. Just constantly surfacing new and interesting music that I had never heard of but was exactly what I liked. I spent so long hunting for info on how they achieved it and if their taxonomies had ever been open sourced.
The Pandora algorithm -- and Pandora's positioning -- is truly a product of its time. You'll want to look into the Music Genome Project (whose very name dates it; the Human Genome Project finished in 2003), but equally influential was the big labels' stranglehold on legal music distribution. When Pandora started and for years of profitability, they had no deals with major record labels, instead promoting small indie artists on the basis of their recommendation engine. If they were going to survive, that engine had to be _excellent_.
But Spotify has that as well. Tons of user curated playlists. And although user playback data is harder to parse through, it's also pretty straightforward to build some clustering algorithm where if you both like X then you might like Y as well.
My theory is that they don't have the incentive. Apple Genius was ridiculously good at music discovery, too. I shudder to think how much I spent on iTunes songs via genius over its run. But now that apple/spotify/etc get my monthly dollars either way, there's no huge incentive for them to create the discovery systems.
The incentive (for Spotify) is to make you consume the least amount of media, since every stream incurs CDN and bandwidth costs, plus royalties.
Spotify is pay to win (play) - especially the user-curated playlists.