by shakna
3013 days ago
I don't know how they are doing it, but Google Scholar does not have an API, and scraping is against their TOS.

> Don’t misuse our Services. For example, don’t interfere with our Services or try to access them using a method other than the interface and the instructions that we provide.

Despite this, there is scholar.py [0], which can extract files from Google Scholar, though it explicitly doesn't work around the rate limits.

[0] https://github.com/ckreibich/scholar.py

Unless this actually exploits something and hacks into Google's servers to get to the content, which would be a quite different matter, it isn't really distinguishable from someone manually visiting the site in a browser, volume aside.
IMHO the pervasive attitude today of somehow requiring permission or an explicitly sanctioned "API" to access what is otherwise publicly accessible data is rather troubling for the freedom and flexibility of the Web as a whole. It encourages walled-garden content models and centralisation.