by Cynddl
691 days ago
I find it interesting that as an (edit: UK) academic researcher, I would likely be forbidden to use tools like this, which fail basic ethics standards, regulations such as GDPR, and practical standards such as respecting robots.txt [given there's no information on embedding.io, it's unlikely I can block the crawler when designing a website]. There's still room for ethical development of such crawlers and technologies, but it needs to be consent-first, with strong ethical and legal standards. The crazy development of such tools has been a massive issue for a number of small online organisations that struggle with poorly implemented or maintained bots (as discussed for OpenStreetMap or Read The Docs).
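For anyone unsure what "respecting robots.txt" means in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The user agent name `ExampleCrawler`, the rules, and the URLs are all made up for illustration; nothing here reflects how embedding.io's crawler actually identifies itself, which is exactly the missing piece of information.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish. "ExampleCrawler" is an
# assumed name: a site can only block a crawler whose user agent is documented.
ROBOTS_TXT = """\
User-agent: ExampleCrawler
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("ExampleCrawler", "https://example.org/page"),
        ("OtherBot", "https://example.org/page"),
        ("OtherBot", "https://example.org/private/notes"),
    ]:
        print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

A well-behaved crawler would run a check like this before every fetch and skip anything disallowed. The opt-out only works if the crawler publishes the user agent string it sends.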
Because if you save the pages you browse on some site, they're yours (authors don't own your cache).
Perhaps you're arguing that if you wrote a lightweight script/browser (which is just your user agent) to save some website for offline use, that'd be unethical and GDPR-violating? Again, I don't think so, but maybe I'm missing something. Perhaps this turns on what defines a "user agent".
Or perhaps this becomes a "depth of pre-fetch" question. If your browser prefetches linked pages, that's "automated" downloading, akin to the script approach above. Downloading. To your cache. Which you own. (That's where I struggle to see an ethical violation.)
Genuinely curious where the line is, or what exactly here is triggering ethics, GDPR and practical standards?