by koheripbal 162 days ago
To add a little context, this suspension comes immediately after Anna's Archive publicly implicated themselves in the Spotify scraping "hack" in which they downloaded nearly the entire content library of Spotify and were preparing to release it publicly (~300TB worth) via torrent.

They published a blog post outlining their plans.


Did the operators _want_ to poke the well connected & well funded bear with a historical anger problem?
No, but they weren't not going to, given that their mission is to archive all cultural content, by hook or by crook.
Archiving it and publishing it are different things.

More importantly, they may sabotage their mission: If Spotify shuts them down, their existing archives and especially future archives may be effectively lost.

I guess I should say more accurately: Their mission is to both archive it and publish it. They seem to be explicitly against copyright, on principle. Which I greatly respect.
It's time to abolish copyright. It creates more problems (stifles innovation, creates rents) than it solves (rewards innovation).
It doesn't create problems for large companies that make AI systems.
Fortunately, Spotify does not have that power. Anna's Archive is not based in US or EU jurisdictions. Spotify can make access for normal people a bit harder, but not shut it down.

(Edited for clarity)

> Fortunately, Spotify does not have that power. They are not based in US or EU jurisdictions.

Perhaps I misunderstood something, but according to my understanding

1. Spotify is registered in Luxembourg and has its operational headquarters in Sweden (Stockholm). Both are EU countries.

2. I guess it won't be Spotify that sues, but the individual music labels (very likely united).

Anna's Archive is not based in the EU (sorry for not being clear). So EU law has limited power to enforce a ban. In Germany it is already "banned" at the ISP level, but only via DNS.

But the real servers are hosted in Kazakhstan or Russia, I think. And they do not cooperate much with EU courts.

So unless the EU installs a great firewall like China, they cannot really shut it down.

Presumably the opposing party is residing in non-US-or(and? depends on the order of evaluation)-EU territory, but I might be mistaken. "They" refers to both sides in the parent comment.
I'm not sure archiving and publishing are different things.
They are, but archiving without publishing is pointless.

I occasionally wonder how many enormous collections of culture like that of Marion Stokes[1] have been lost because their curators made no effort to realize the value of their collection.

1. https://en.wikipedia.org/wiki/Marion_Stokes

Most archives - the ones in libraries, etc. - are not published, except they are available to qualified people who physically travel there. Most are not even fully indexed - nobody knows all of what's there.
> They are, but archiving without publishing is pointless.

One may collect/archive now (when the data is, well, "available"), and publish later, when copyright expires and the material will likely be harder to obtain.

I can save a copy of my friend's book on my computer, archiving it. Nobody else could see it unless I publish it.
They stated that they would pass the information on to other archivists and public/private trackers, no? They obviously have backups, since there are multiple users seeding GBs and even TBs of data. Mirrors can be created as well, like TPB.
No, because they are all backed up via torrents. Good luck getting those "shut down" from the DHT.
They didn’t come anywhere close to the entire content library: the 300TB represents about 33% of Spotify, though it is close to 100% of the played music.
Kind of nuts that 66% of their library is virtually unplayed. It’s hard to make it as a musician.
It is ridiculously easy to create an album with Suno and push it to Spotify. I'm surprised it's only 66% TBH
Anna's archive has a great analysis of the Spotify data.

They identify a huge surge in tracks that few listen to after gen AI started.

The analysis is worth reading. The distribution is (Pareto)^3: ~99% of the tracks played are 1% of the catalogue.

1. Generate slop music nobody will ever listen to

2. ????

3. Profit
It's actually:

1. Generate slop music no _human_ will ever listen to

2. Use a botnet to "play" this music en masse

3. Profit

This is a whole arms race, with companies (such as Beatdapp) specializing in detecting fraudulent plays.

Source: I work for a niche music retailer that struggles with the same issues on a smaller scale.

From a stat I saw years ago, about the same proportion of apps on the iOS App Store have never been downloaded.
To be completely fair, I am not certain what it means for a track to be "virtually unplayed".

First off, it was striking to me how little of the "top 10 000" they published back on Christmas I recognize. I'm not sure what I expected, but 10 000 sounds like a big number, so it seemed likely to me that if I picked a random song from my playlist I could find it there. It turned out I can hardly find an artist I recognize. Ok, I can recall a song from Lady Gaga and even Billie Eilish, I've heard of Bruno Mars (cannot recall any song), but I have no idea what "Bad Bunny", "Doechii", "Drake" are. I mean, I think I do have a pretty good idea what these things are (abstractly), and I probably wouldn't want to listen to that. And while I knew that all this stuff is very popular, I didn't quite realize how little room in the top 10 000 it leaves for the music I (and everyone I know) actually listen to.

I didn't download the metadata they released (it would be hard to process it on my laptop anyway), but now I wonder how much of my 3 TB music collection is in top 100 000, or heck, even top 1M Spotify, or on Spotify at all.

I also am sometimes surprised how few scrobbles some tracks get. I didn't bother to find out what this means, how many people still scrobble to Last.fm or ListenBrainz, but it is just surprising when I see that a track that I didn't consider to be obscure was scrobbled like 50 times this year.

So I'm saying that the music world seems to be terribly fragmented, even more than I imagined. So the very premise of AA backing up 97% of Spotify (by the number of plays) may be a much lesser achievement at "preserving culture" than it may sound. And of course we are about 8 years too late to back up everything, since by now half of it must be generative NN bullshit. And I'm not even sure it's in those leftover 3% (bots listen to bot-generated music too, right?).

> It turned out I hardly can find an artist I recognize

I've heard of 9 of the top 10 and 15 of the top 20 at https://chartmasters.org/most-monthly-listeners-on-spotify/

You might not listen, but surely you have heard of Taylor Swift, Justin Bieber, Ariana Grande, Ed Sheeran, Coldplay and of course the Christmas staples Mariah Carey and Wham?

First off, this is not the top we are talking about, since there is one that AA provided[0]. I am not sure why it matters which names exactly I've heard of, but if you are that curious: I don't know what Ed Sheeran and Wham are (but cannot vouch I've never heard their music in a supermarket), but I definitely remember "Coldplay" being mentioned in a joke onstage by a NIN member[1], though I didn't bother to check out what they are. I can imagine the faces of Taylor Swift & Justin Bieber, but cannot name any song, and I'm sure I've heard Mariah Carey somewhere, since that name has been around longer than Rihanna. I have a song or two of Ariana Grande in my playlist though.

Edit: Ok, I've finally googled "Coldplay". Yeah, definitely heard "Clocks" somewhere.

[0] https://annas-archive.li/blog/spotify/spotify-top-10k-songs-...

[1] https://www.youtube.com/watch?v=qboe5CebixA

You're a (waaay) outlier.
The funny thing is, since the advent of streaming I no longer listen to the radio. I listen to new music, but little pop music, and I have never heard a single track from Swift, Bieber, Grande or Sheeran. Coldplay is the only act I like on that list, and the streaming services are pretty good at only playing what I like.

If they were pre-streaming artists I probably would have heard a lot of their catalog because radio played it over and over. Unfortunately you just can’t get away from the Christmas music.

Sure, but I'm sure you've heard of Taylor Swift and Justin Bieber.
Traditional radio mostly sucks, but Soma.fm and KEXP are both great for discovering new music.
Very hard if you have little talent..
> For now this is a torrents-only archive aimed at preservation, but if there is enough interest, we could add downloading of individual files to Anna’s Archive. Please let us know if you’d like this.

If it is torrents only, what difference does unregistering the domain make?

Ideally, if AA doesn't have any public web presence it's a lot harder to publicly disseminate those torrents.

Realistically, it's just a way for someone to say something is being done about this, even if it's not going to actually make a difference.

Establishing a position Anna's opponents may consider an advantage.

And there is a site idea!

Annasopponents.news --> Can inform passersby on anything related to Anna's Archive along with activism related material, how to's and the like.

Yeah, obviously I don't know if it is actually related, but my first thought when I couldn't open it today was "Told you so"...
Spotify was created from a library of pirated music.. the irony
Came here to say that.

A while back, another site started with a pile of pirated music, and that was allofmp3.com. Remember those peeps?

Their business model was to sell music by selling bandwidth. Basically it was all the music you want charged by the megabit download.

Pop titles were $0.10 to $0.25. A whole album at 256 kbps was roughly $3 give or take.

What got me really thinking was how great the UX was. At the time, few came close.

The end of that site was packaged up with Russia's entry into the WTO.

I seem to remember hearing about huge torrents out there too. The right infohash can point a person to huge archives of various kinds: books, video, academic papers, music, the WikiLeaks insurance files, which are password protected, as perhaps all of these are.

As someone who grew up poor in an ex-Eastern Bloc country, allofmp3.com was a godsend.