by dejobaan 412 days ago
> Feel free to submit a PR!

I think it's cool that you're experimenting in this area, but I'm not a huge fan of this as an answer to a question about responsible/respectful web crawling. This stuff seems like it should be table stakes (even if you wanted to make it optional for the end user), but "yeah probably; learn the codebase, fork it, make changes, then we'll review it" really puts the onus on the original poster.


Ah, you're right, my bad. I hope I didn't sound dismissive, because I think some sort of robots.txt needs to exist for AI that scrapes the web, at both training and test time.

I'm really not excited at all about the "scrape other people's data" use case for BLAST, and if we can prevent it, then awesome. I'm excited about BLAST automating science, legacy web apps, internal tools, adding AI automation to your own app, etc.

Curious: if a user has an ad blocker, are they browsing responsibly?

Very. Malvertising is a thing. Adtech surveillance is a thing.

An ad blocker is the least a user can do.

That's not why most users use ad blockers.