Hacker News
by zalebz 1594 days ago
It potentially could be related to all of the "knock-off" websites that scrape StackExchange data. Maybe they are going outbound on various Tor nodes and getting the IPs blacklisted as a result of reading thousands of pages too rapidly.

Given my experience with the network quality of Tor, I'd be surprised if A) scraping were efficient to do over Tor and B) Stack Overflow would even notice it. Tor is too slow to add much traffic compared to the absolutely staggering amount of traffic they get from non-Tor sources.
Also, what would be the point? It's easier to download a data dump from https://archive.org/details/stackexchange
You would be surprised/disappointed at the amount of abuse the bigger sites have to handle

Things like this https://news.ycombinator.com/item?id=26072025

Interesting, but it doesn't fit the context of Stack Exchange blocking Tor. Your example there is about a mobile app hotlinking an image of a flower, which seems easy enough to block/fix, while Stack Exchange blocking all Tor users from even reading Stack Overflow doesn't make as much sense.
1 image abused enough, they dig, find the culprit being an app.

Fixes: updating the app, blocking the image, blocking all requests with empty user agents.
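
For what it's worth, the last of those fixes is about as cheap as they come. A rough sketch in Python, assuming the request headers arrive as a plain mapping; `should_block` is a made-up name for illustration, not anything from the linked thread or from Stack Exchange:

```python
from typing import Mapping


def should_block(headers: Mapping[str, str]) -> bool:
    """Return True when the request has a missing or blank User-Agent."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    user_agent = normalized.get("user-agent", "")
    # Treat whitespace-only values the same as a missing header.
    return user_agent.strip() == ""
```

A real site would more likely do this at the proxy or CDN layer, but the check itself is the same.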

------

1 person abused enough SO, they dig, find the culprit being someone using Tor network.

Fixes: identify the user and ask them to stop, block all traffic from Tor.

-------

Do you propose an alternative fix?

I wasn't aware that existed, and obviously that would make any scraping utterly useless.