by blister
1041 days ago
Hah, I literally just fought this for the past month. We run a large esports league that relies on player ranked data. They have the data, and as mentioned above, they send it down to the browser in beautiful JSON objects. But they're sitting behind Cloudflare and aggressively blocking attempts to fetch data programmatically, which is a huge problem for us with 6000+ players' worth of data to fetch multiple times every 3 months.

So... I built a Chrome extension to grab the data at a rate that usually stays under their detection threshold. Basically, I created a distributed scraper and passed it out to as many people in the league as I could. For big jobs, when we want to do giant batches, it's a simple matter of doing the pulls and, when we start getting HTTP 429 errors (the status code they use for rate limiting), switching to a new IP on the VPN. The only way they can block us now is if they stop having a website.

Give one of the commercial VPN providers a try. They're usually pretty cheap and have tons of IPs all over the place. Adding a "VPN Disconnect / Reconnect" step to the process cost only about 10 extra seconds on the occasional request that hit the limit.
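
For anyone wanting to picture the batch loop described above, here is a rough Python sketch, not the actual extension code. Everything named here is an assumption for illustration: `fetch` is a hypothetical callable returning `(status_code, body)` for one player, and `on_rate_limited` is a hypothetical hook standing in for whatever you do when a 429 shows up (a long pause, or the "VPN Disconnect / Reconnect" step).

```python
import time
from typing import Callable, Dict, Iterable, Tuple


def pull_batch(
    player_ids: Iterable[str],
    fetch: Callable[[str], Tuple[int, str]],   # hypothetical: returns (status, body)
    on_rate_limited: Callable[[], None],       # hypothetical: pause or reconnect hook
    delay_s: float = 1.0,                      # assumed gap between requests
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Fetch each player once, slowly, retrying a player after an HTTP 429."""
    results: Dict[str, str] = {}
    for pid in player_ids:
        for _ in range(max_retries + 1):
            status, body = fetch(pid)
            if status == 429:
                # Rate limited: run the hook, then retry the same player.
                on_rate_limited()
                continue
            if status == 200:
                results[pid] = body
            # Any non-429 status ends the attempts for this player.
            break
        # Throttle between players to stay under the limit.
        sleep(delay_s)
    return results
```

A player that is still rate limited after `max_retries` retries, or that returns any other non-200 status, is simply left out of the result, so the caller can diff the returned keys against the input list and re-queue the rest.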