by dorkwood 809 days ago
I did a bit of data scraping for fun in the past, but I was never quite sure of the legality of what I was doing. What if I was breaking some law in some jurisdiction of some country? Was someone going to track me down and punish me?

OpenAI has taught me that no one gives a shit. Scrape the entire internet if you want, and use the data for whatever you feel like.

4 comments

We were really heading someplace with The Semantic Web aka The Real Web 3.0 [1]

Alas, we have to fight against the machines in order to properly read the internet through machines.

I believe Discourse knowingly keeps its data easy to scrape though, so kudos to them!

[1]: https://en.wikipedia.org/wiki/Semantic_Web

> OpenAI has taught me that no one gives a shit. Scrape the entire internet if you want, and use the data for whatever you feel like.

Cloudflare gives a shit.

My household had to use our 5G internet for most things for a week or two until our IP reputation recovered.

Yeah, it’s probably worth renting a server if there’s any doubt about whether it’s wholly appropriate to do something.
Some sites just block the entire AWS/GCP IP address ranges.
Do you think it would be better if someone did track you down and punish you? Which world do you want to live in?
I think large companies should be punished for stealing from people to make themselves richer.
A precursor to this would have been that LinkedIn lawsuit Microsoft lost, allowing that one company to scrape all of LinkedIn (technically "public information").
hiQ Labs v. LinkedIn