Hacker News
by _bxg1 2651 days ago
Headless Chrome is available and widely used, and you can usually get around JS-rendered content by simply waiting a few seconds after the page loads before scraping. I'd assume the crawling itself isn't the hard part (aside from maybe just the raw compute time it takes).
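
For what it's worth, here's a rough sketch of the "load it in headless Chrome, give the JS some time, then grab the DOM" idea. It shells out to Chrome with `--headless` and `--dump-dom`, and swaps the literal sleep for `--virtual-time-budget`, which as far as I know lets the page's timers run for the given number of milliseconds before the DOM is dumped. The function names, the `google-chrome` binary name, and the 5000 ms default are my own assumptions, not anything standard; adjust for your install.

```python
import subprocess


def build_dump_dom_command(url, wait_ms=5000, chrome_binary="google-chrome"):
    """Build a headless Chrome command that prints the rendered DOM.

    chrome_binary is an assumption: it may be "chromium", "chrome", or a
    full path depending on the machine.
    """
    return [
        chrome_binary,
        "--headless",                        # run without a visible window
        "--disable-gpu",                     # commonly added for headless runs
        f"--virtual-time-budget={wait_ms}",  # let page scripts run before dumping
        "--dump-dom",                        # write the serialized DOM to stdout
        url,
    ]


def fetch_rendered_html(url, wait_ms=5000, chrome_binary="google-chrome",
                        timeout_s=60):
    """Run headless Chrome and return the post-JavaScript HTML as a string."""
    cmd = build_dump_dom_command(url, wait_ms, chrome_binary)
    result = subprocess.run(
        cmd,
        capture_output=True,
        text=True,
        timeout=timeout_s,  # hard wall-clock limit so a hung page can't stall a crawl
        check=True,         # raise CalledProcessError if Chrome exits non-zero
    )
    return result.stdout
```

Calling `fetch_rendered_html("https://example.com")` should hand back the HTML after scripts have had a chance to run, assuming Chrome is installed under that binary name. Every page costs a full browser launch this way, which is where the raw compute time comes in.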