|
|
|
|
|
by 1gn15
314 days ago
|
|
As with what others had said, this is less of a robots.txt and more of a sitemap. The issue with this is that website owners don't want to do this. Take Reddit removing the API for example. Everyone just switched to scraping the Reddit website instead for any remaining third party clients. Yes, APIs were supposed to be a compromise to lower the resources needed on both sides, but Reddit's stock price is linked to the value of "their" data, so... Alternatively, malicious website owners may make incorrect Aura files to mislead user agents. Then we're back to screen scraping as the ground truth, because behaving like a human is the best way to avoid discrimination. |
|