Cloudflare seems to have a 4th:

4) Provide a challenge in the response that is impossible for anyone to complete

One way to see this is to use Selenium to launch your browser. E.g., run this code in Python:

    from selenium import webdriver
    browser = webdriver.Chrome()

Then, when the browser launches, start using it manually to surf the web [1]. This works great on most sites I've visited this way, including my financial institutions. But if it hits a Cloudflare CAPTCHA, it fails.

For example, try this on fanfiction.net. It hits the browser check page if I try to go to any category or story page. I click the checkbox to tell it I'm real, get the challenge to identify the lions or whatever, do that until it is satisfied I really can identify lions...and then it just goes back to the browser check page. As far as I can tell, at that point it is just an endless loop of checking the box and identifying the things.

There are some settings in Selenium that somewhat hide from the site that Selenium is involved. For a while they allowed getting past the CAPTCHA, but that stopped working. There's also a project somewhere on GitHub to make a Selenium Chrome driver specifically designed not to trigger bot detection, which also worked for a while and then stopped.

[1] Why would I want a Selenium-launched browser if I'm going to be using it manually? It's for sites where I want to automate things on just some pages. For example, one of my financial institutions has a lot of options on its transaction download page. After I finish manually doing things like checking balances, looking at recent activity, and paying bills, and want to finish by downloading transactions, I can have the script that launched the browser handle that.
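
To make the workflow in [1] concrete, here is a rough sketch of a script that launches Chrome, waits while you browse manually, and then takes over for one final step. The URL path, the button ID, and the helper names `on_download_page` and `run_session` are all made up for illustration; a real site will need its own values, and this is only one way to structure the handoff.

```python
from urllib.parse import urlparse

# Hypothetical values for illustration; substitute the real ones for your site.
DOWNLOAD_PATH = "/transactions/download"
DOWNLOAD_BUTTON_ID = "download-button"


def on_download_page(current_url: str, expected_path: str = DOWNLOAD_PATH) -> bool:
    """Return True if the URL's path starts with expected_path."""
    return urlparse(current_url).path.startswith(expected_path)


def run_session(start_url: str) -> None:
    """Launch Chrome, hand it to the user, then automate the last step."""
    # Imported here so the helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    browser = webdriver.Chrome()
    browser.get(start_url)

    # Manual phase: the script blocks here while you use the window normally.
    input("Browse manually, then press Enter once you are on the download page... ")

    # Automated phase: only act if the browser ended up where we expect.
    if on_download_page(browser.current_url):
        browser.find_element(By.ID, DOWNLOAD_BUTTON_ID).click()
    else:
        print("Not on the download page; leaving the browser alone.")
```

Calling `run_session("https://bank.example/login")` (a placeholder URL) from an interactive Python session would start it. Note that `browser.current_url` reports the window the driver is attached to, so this sketch assumes you stay in the original tab.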