Hacker News
by adolph 1 day ago
> scrape pricing data for used cars

Time was you could get lovely JSON feeds from every site by iterating on the curl statement copied from the inspector. Nowadays you can't even use Selenium without Cloudflare getting grouchy. Last fall I had to build my spreadsheet like a caveperson: Ctrl-C, Ctrl-V. It wouldn't be so bad if the dealer aggregators' coverage were mutually exclusive, but it overlaps, so you have to dedupe listings. Then there's the whole problem of online salespeople who don't show up at the dealership.
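For what it's worth, the dedupe step could look something like the rough sketch below. The field names (`vin`, `year`, `make`, `model`, `mileage`) are made up for illustration, and it assumes most aggregators expose a VIN, which may not hold for every site.

```python
def listing_key(listing):
    """Build a comparison key for one listing (a plain dict).

    Assumption: a VIN, when present, identifies the car. Otherwise fall
    back to a looser key, which can produce false matches.
    """
    vin = (listing.get("vin") or "").strip().upper()
    if vin:
        return ("vin", vin)
    return (
        "fallback",
        listing.get("year"),
        str(listing.get("make", "")).strip().lower(),
        str(listing.get("model", "")).strip().lower(),
        listing.get("mileage"),
    )


def dedupe(listings):
    """Keep the first listing seen for each key, preserving input order."""
    seen = {}
    for item in listings:
        seen.setdefault(listing_key(item), item)
    return list(seen.values())
```

Pasting rows from two aggregators into one list and running `dedupe` on it would keep one row per car. The fallback key is deliberately crude, so rows without a VIN are worth eyeballing by hand.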

1 comment

There's a JavaScript property called navigator.webdriver that returns true when the browser is being driven by WebDriver automation such as Selenium. Obviously, every antibot system checks it. Obviously, you can patch it to always say false.
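A minimal sketch of that patch, assuming Selenium 4 with a Chromium-based driver (the only kind that has `execute_cdp_cmd`). The helper name `install_webdriver_patch` is made up here. It registers a script that runs before each page's own scripts and redefines the property:

```python
# JavaScript that shadows navigator.webdriver with a getter returning false.
PATCH_JS = "Object.defineProperty(navigator, 'webdriver', {get: () => false});"


def install_webdriver_patch(driver):
    """Ask Chromium (via the DevTools protocol) to run PATCH_JS on every new document.

    `driver` is assumed to be a Selenium Chromium driver, e.g. webdriver.Chrome().
    """
    driver.execute_cdp_cmd(
        "Page.addScriptToEvaluateOnNewDocument",
        {"source": PATCH_JS},
    )
```

Usage would be roughly `driver = webdriver.Chrome(); install_webdriver_patch(driver); driver.get(url)`. This only hides the one flag; antibot systems look at other signals too, so it may not be enough on its own.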