by ribadeo
536 days ago
How do you know what the contextual configuration of their robots.txt is/was? Your accusation was directly addressed by the author in a comment on the original post, IIRC. I find your attitude as expressed here to be problematic in many ways.
|
For convenience, you can view the extracted data here:
https://pastebin.com/VSHMTThJ
You are welcome to verify for yourself by searching for “wiki.diasporafoundation.org/robots.txt” in the CommonCrawl index here:
https://index.commoncrawl.org/
The index contains a file name that you can append to the CommonCrawl URL to download the archive and view it.
More detailed information on downloading archives here:
https://commoncrawl.org/get-started
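For anyone who would rather script the lookup than click through it, here is a minimal sketch of the URL construction involved. It assumes the index is queried through its CDX-style endpoint, that archives are served from data.commoncrawl.org, and that each index record carries `filename`, `offset`, and `length` fields. The helper names are my own, and the crawl collection id is something you pick from the index page.

```python
from urllib.parse import urlencode

# Assumed hosts: the index server linked above and CommonCrawl's public data host.
INDEX_HOST = "https://index.commoncrawl.org"
DATA_HOST = "https://data.commoncrawl.org"


def index_query_url(collection: str, page_url: str) -> str:
    """Build the index query URL for one crawl collection (hypothetical helper)."""
    query = urlencode({"url": page_url, "output": "json"})
    return f"{INDEX_HOST}/{collection}-index?{query}"


def archive_url(filename: str) -> str:
    """Append the file name from an index record to the data host."""
    return f"{DATA_HOST}/{filename}"


def range_header(offset: int, length: int) -> dict:
    """HTTP Range header selecting a single record inside the archive file."""
    return {"Range": f"bytes={offset}-{offset + length - 1}"}
```

The `offset` and `length` values arrive as strings in the JSON records, so convert them with `int()` before calling `range_header`. The fetched bytes are a gzip-compressed WARC record, which you can decompress and read as text.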
From September to December, the robots.txt at wiki.diasporafoundation.org contained this, and only this:
> User-agent: *
> Disallow: /w/
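If you want to check mechanically what those two rules cover, a rough sketch with Python's standard `urllib.robotparser` follows. The `is_allowed` helper and the two example paths in the note below are my own illustrations, not anything taken from the archive.

```python
from urllib.robotparser import RobotFileParser

# The two rules quoted above, exactly as extracted.
ROBOTS_TXT = """\
User-agent: *
Disallow: /w/
"""


def is_allowed(path: str, user_agent: str = "*") -> bool:
    """Return True if the quoted rules permit fetching the given path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, "https://wiki.diasporafoundation.org" + path)
```

Under these rules a hypothetical path such as `/w/index.php` is disallowed for every user agent, while one such as `/wiki/Main_Page` is not, because it does not start with `/w/`.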
Apologies for my attitude, but I find defenders of the dishonest in the face of clear evidence even more problematic.