by throwaway6845 3299 days ago
We're not talking generic "bots".

We're talking a custom scraper written for this site and this site only.

Yes, I am expecting the people who spend hours inspecting the source of my site, and then writing a custom scraper for it, to spend 30 seconds reading the T&Cs first.


Not sure why you'd expect that. If my web browser can download your source code, my software will as well.

If you want people to read it, put your content behind a sign-up with a checkbox.

It is _already_ behind a sign-up with a checkbox. They scraped their way past that too.
How? (Seriously, how does one do this?)
Simply log in first, then perform the scrape programmatically.

Seen here: https://kazuar.github.io/scraping-tutorial/
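
For anyone who wants the idea without clicking through, here is a rough sketch of "log in first, then scrape" using only Python's standard library. It is not the code from the linked tutorial. The URLs and the form field names (`username`, `password`, `accept_terms`) are made up for illustration, and a real site may also want a CSRF token or other hidden fields, which this does not handle.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Build an opener that stores and resends cookies, like a browser does."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def encode_login_form(username, password, accept_terms=True):
    """Encode the login form body. Field names here are hypothetical."""
    fields = {"username": username, "password": password}
    if accept_terms:
        # To the server, a ticked checkbox is just one more form field.
        fields["accept_terms"] = "on"
    return urllib.parse.urlencode(fields).encode("utf-8")


def login_and_fetch(login_url, page_url, username, password):
    """Log in, then fetch a members-only page with the same cookie jar."""
    opener, _jar = make_session()
    # Passing data= makes this a POST; the session cookie lands in the jar.
    with opener.open(login_url, data=encode_login_form(username, password)) as resp:
        resp.read()
    # The opener sends the stored cookie back, so this request is logged in.
    with opener.open(page_url) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Usage would be something like `login_and_fetch("https://example.com/login", "https://example.com/members", "me", "secret")`, with both URLs being placeholders. The point is that nothing is circumvented: the script submits the same form a person would, checkbox included, and then reuses the session cookie.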

Oh... I thought they were able to circumvent logging in and scrape directly... that makes much more sense now, thank you...
Ah, that changes things somewhat.