by sneak
2295 days ago
If you think developing web spider software is akin to developing nuclear weapons, I think you might want to go have a talk with some larger, well-known companies who have not only half-developed not-yet-working software (like my ActivityPub spider, which doesn't even have a storage backend at the moment), but who have fully developed advanced web spiders that have actually downloaded and archived exabytes of data from the web, to be saved privately for all time. Frequently they even let anyone who wants to search the full text of it, usually without authentication! If you don't want second parties to have copies of your data, configure your webserver not to send it to them when they request it. You can't force someone to do something with an HTTP request.
Anyway, I work at one of those companies. You know what they have? Ways to let users opt out (e.g. robots.txt), ways to ensure they're not DoSing people when scraping (which uses material resources: compute time, spindles, electricity, etc.), ways to track the copyright of the source material (which usually belongs to the author), and ways to respond to requests (legal and non-legal notices) from second parties who want to know how much of their data has been scraped or who want to exercise their rights over their material. These technological features exist because this is what human societies have found to be a decent balance between scrapers' rights and internet users' rights. Your solution lacks this due consideration and gives internet users a giant middle finger.
In your last paragraph it is pretty clear you are doing this because of some ill-conceived "ethical" notion that "because HTTP responded with this payload, it is now mine with an 'ethical license' to do anything". There are other ways to point out security flaws in ActivityPub that are way more constructive and less asshole-ish, but it seems you're pretty keen to erase a lot of moral and legal nuance to prove that "having a technological capability means I have the moral ought and the legal right". Sorry, but no: the world is a lot more complex than this.
Just because I have the technological capability to transmit the message "you're being a dick" from the comfort of my home doesn't automatically mean it would be ethical for me to, so of course I am not going to tell you "you're being a dick", and normally I wouldn't type this sentence at all but in this special case I am because it shouldn't be a problem with your ethical system since I'm not actually saying it despite having the technological capability, so it should have no impact on you (and if it did, it should give you pause to reconsider that maybe you need to do more self-reflection on discovering your actual reasons for doing this ill-advised project).
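For concreteness, here is roughly what the robots.txt opt-out and the "don't DoS people" point above look like in code. This is only a minimal sketch using Python's standard `urllib.robotparser`; the user agent string `example-spider`, the `PoliteScheduler` class, and the helper names are made up for illustration and are not taken from any real crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical user agent name, used only for this illustration.
USER_AGENT = "example-spider"


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse the text of a robots.txt file that was already downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser: RobotFileParser, url: str) -> bool:
    """Honor the site's opt-out: only fetch what robots.txt permits."""
    return parser.can_fetch(USER_AGENT, url)


def delay_for(parser: RobotFileParser, default: float = 1.0) -> float:
    """Use the site's Crawl-delay if it declares one, else a default."""
    delay = parser.crawl_delay(USER_AGENT)
    return float(delay) if delay is not None else default


class PoliteScheduler:
    """Hypothetical helper: enforces a minimum interval per host."""

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self._last = {}  # host -> timestamp of the last request

    def wait_time(self, host: str, now: float) -> float:
        """Seconds to wait before the next request to this host."""
        last = self._last.get(host)
        if last is None:
            return 0.0
        return max(0.0, self.min_interval - (now - last))

    def record(self, host: str, now: float) -> None:
        """Remember when a request to this host was made."""
        self._last[host] = now
```

A crawler built along these lines would check `allowed(...)` before every fetch and sleep for `wait_time(...)` before hitting the same host again. Copyright tracking and handling takedown or data requests are process problems as much as code problems, so they are not sketched here.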