| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by CHsurfer 4463 days ago
	At first, this seems correct. It's definately easier to get scraping with something like Capybara and a suitable js enabled driver, but in my experience, this solution is less reliable. Async loaded data can time out and don't get me started on the difficulties of running the scraper with cron jobs. In the end, I migrated even my JS heavy pages to Mechanize based solutions. It takes a few extra requests to get the async data, but once you get that figured out, it's rock solid - till they update the site design ;-)