by stoneridge 3144 days ago
Haven't tried this[0] yet, but Scrapy should be able to handle JavaScript sites with the JavaScript rendering service Splash[1]. scrapy-splash[2] is the plugin to integrate Scrapy and Splash.

[0] https://blog.scrapinghub.com/2015/03/02/handling-javascript-...

[1] https://splash.readthedocs.io/en/stable/index.html

[2] https://github.com/scrapy-plugins/scrapy-splash
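For anyone wondering what the rendering step amounts to: as far as I understand it, Splash is an HTTP service, and its render.html endpoint returns a page's HTML after the JavaScript has run. Below is a rough, stdlib-only sketch of calling it directly, without Scrapy. The local address (Splash's default port 8050), the helper names, and the 0.5 second wait are my assumptions for illustration, not something taken from the linked post.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a Splash instance is running locally on its default port.
SPLASH_BASE = "http://localhost:8050"


def splash_render_url(page_url, wait=0.5, base=SPLASH_BASE):
    """Build a request URL for Splash's render.html endpoint.

    Hypothetical helper: `url` is the page to render and `wait` is how many
    seconds Splash should let the page's scripts run before returning.
    """
    query = urlencode({"url": page_url, "wait": wait})
    return f"{base}/render.html?{query}"


def fetch_rendered_html(page_url, wait=0.5, base=SPLASH_BASE):
    """Return the page's HTML after Splash has executed its JavaScript."""
    with urlopen(splash_render_url(page_url, wait, base)) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling fetch_rendered_html("https://example.com/") needs a live Splash instance, while splash_render_url only builds the request string. Inside a Scrapy project, scrapy-splash[2] provides a SplashRequest class that takes the place of this manual plumbing.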

2 comments

HTMLUnit in Java is a good browser emulator and can be used to work with JavaScript-heavy web sites, form submission, etc.
Reading this on my phone, it looked like you meant there was a web scraping tool actually called “this[0]”, which would be a cracking name.