Hacker News
by monkin 2296 days ago
I like the idea behind the fediverse; nevertheless, it's mostly suitable for HN users or as a replacement for phpBB forums in small communities.

> Well, it's fun to browse but come on, imagine a timeline of all tweets on Twitter: mostly useless.

BTW, this will happen too; it's just not popular enough yet to catch that.

I started that project, but haven’t had time yet to finish it:

https://git.eeqj.de/sneak/feta

Careful with this. A lot of communities don't like scraping of their public content. I think there was a guy who got booted from archive.org for trying to archive an instance that had a lot of under-18 folks' content.

I'd encourage you to build a federated app instead. :)

I think you are making the assumption that I am scraping content based on the fact that I am developing a free software content scraper that anyone is invited to use.

Yep, just like I assume countries that enrich plutonium to weapons grade levels and stick it on the pointy end of an ICBM are threatening others with nuclear weapons, even if they're never launched.

If you think developing web spider software is akin to developing nuclear weapons, I think you might want to go have a talk with some larger, well-known companies who have not only half-developed not-yet-working software (like my activitypub spider, which doesn't even have a storage backend at the moment), but who have fully developed advanced web spiders that have actually downloaded and archived exabytes of data from the web, to be saved privately for all time. Frequently they even let anyone who wants search the full text of it, usually without authentication!

If you don't want second parties to have copies of your data, configure your webserver not to send it to them when they request it. You can't force someone to do something with an HTTP request.

Your first statement looks like it should be logical, but when read for soundness, the consequent ("[then] I think you might...") makes absolutely no sense following the antecedent ("if you think..."). I only mentioned nuclear weapons to try to really emphasize to you that a technology's existence is enough to cause fear in people and communities, which does have real world consequences. But I don't think you care about that.

Anyway, I work at one of those companies. You know what they have? Ways to let users opt out (e.g. robots.txt), ways to ensure they're not DoSing people when scraping (which uses material resources: compute time, spindles, electricity, etc.), ways to track the copyright of the source material (which usually belongs to the author), and ways to respond to requests (legal and non-legal notices) from second parties who want to know how much of their data has been scraped or to exercise their rights over their material. These technological features exist because this is what human societies have found to be a decent balance between scrapers' rights and internet users' rights. Your solution lacks this due consideration and gives internet users a giant middle finger.
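
To make the "not DoSing people" point concrete, here is a minimal sketch of per-host throttling in Python. It is not how any particular company does it; `HostThrottle`, `delay_for`, and the two-second interval are hypothetical names and values chosen only for illustration.

```python
import time
from urllib.parse import urlsplit


class HostThrottle:
    """Spaces out requests so each host sees at most one per min_interval seconds."""

    def __init__(self, min_interval: float, clock=time.monotonic):
        self.min_interval = min_interval  # seconds between requests to one host
        self.clock = clock                # injectable so the logic can be tested
        self._next_allowed = {}           # host -> earliest time of the next request

    def delay_for(self, url: str) -> float:
        """Reserve the next slot for this URL's host; return seconds to wait first."""
        host = urlsplit(url).netloc
        now = self.clock()
        start = max(now, self._next_allowed.get(host, now))
        self._next_allowed[host] = start + self.min_interval
        return start - now
```

A crawler loop would call `time.sleep(throttle.delay_for(url))` before each request, so a single small instance is never hit faster than the configured interval no matter how many URLs are queued for it.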

In your last paragraph it is pretty clear you are doing this because of some ill-conceived "ethical" notion that "because HTTP responded with this payload, it is now mine with an 'ethical license' to do anything". There are other ways to point out security flaws in ActivityPub that are way more constructive and less asshole-ish, but it seems you're pretty keen to erase a lot of moral and legal nuance to prove "because I have a technological capability means I have the moral ought and the legal right". Sorry, but no: the world is a lot more complex than this.

Just because I have the technological capability to transmit the message "you're being a dick" from the comfort of my home doesn't automatically mean it would be ethical for me to, so of course I am not going to tell you "you're being a dick", and normally I wouldn't type this sentence at all but in this special case I am because it shouldn't be a problem with your ethical system since I'm not actually saying it despite having the technological capability, so it should have no impact on you (and if it did, it should give you pause to reconsider that maybe you need to do more self-reflection on discovering your actual reasons for doing this ill-advised project).

I'm going to be blunt: your ethics statement sucks. It reeks of "I don't care what you intended, I'm going to use your data in ways you didn't want because nothing is physically stopping me". At the very least, that's a terrible attitude to describe as an "ethics statement". If you were to call it "justification" instead, at least it would be internally consistent.

I see that your code makes no mention of robots.txt, so you've designed it in such a way that explicitly ignores each instance's published intention. You can't reasonably make any claims about "consent" while pretending that "User-agent: *; Disallow: /" isn't there.
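
For reference, honouring that file takes only a few lines. The following is a sketch using Python's standard `urllib.robotparser`, not the linked project's code; `is_allowed`, the user agent string, and the example URL are assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The blanket opt-out quoted above, written as it would appear in the file.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# Hypothetical spider name and instance URL.
allowed = is_allowed(
    BLOCK_ALL,
    "hypothetical-spider",
    "https://example.social/api/v1/timelines/public",
)
```

With the blanket rule, `allowed` is `False` for every path on the host, API endpoints included; an instance with an empty robots.txt would yield `True`.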

At first glance it does not scrape the web UI, and uses public APIs only. So mentioning that it ignores robots.txt is not a solid argument. These APIs are there specifically for automated use.

I agree that this "ethics statement" is of no use, though. The author should have ignored these people who get upset because of their posts being copied.

Every time people get upset because of reasonable behavior of others and unreasonably attempt to control that behavior, it is an opportunity for teaching.

You use this phrase:

> ... reasonable behavior of others ...

According to whom? Every dictator thinks it's reasonable behaviour for them to crush the opposition, while those who look on, or those who suffer, will usually believe that to be unreasonable behaviour.

Something I learned from a colleague once is this:

"No one ever thinks they are the bad guy."

So your concept of reasonable behaviour may not match mine, and you may exhibit behaviour that will upset me. That's not an opportunity for me to learn, that's an opportunity for me to seek some sort of recourse against you.

Don't be surprised if others attempt that.

Publishing software is protected expression in the place I am writing it, so I will absolutely be surprised if others attempt that: it would be illegal under the laws in this place.