by identity-haver
2637 days ago
There was a claim [1] that the G+ terms of service might legally prohibit them from doing this after the service is shut down. I haven't verified it. However, it's clear that for an archiving effort this big, people at Google are explicitly allowing it. The user agents and fetch patterns of the Archive Team crawler were distinct enough to get caught by an automated tool, and it took someone knowing someone at Google to get it unblocked.

Unfortunately, any archival effort that requires the "Warrior" crawler (and not just a guy with a 4TB disk) is at the mercy of the website's remaining staff and management. Just ask SoundCloud: Archive Team started archiving their stuff when it looked like they were going to go under, but SoundCloud shut them down.

[1] https://news.ycombinator.com/item?id=19410050