by mipmap04 2901 days ago
It gets especially difficult with dynamic content or when trying to scrape sites built on very heavy frameworks like ASP.NET Web Forms that require passing the view state with every request. I made a calendar aggregator for adult hockey times in my area that scrapes rink websites[0], and it was far more difficult than I had expected because the rinks all used Telerik Web Forms controls for their calendars. It turned a 30-minute job into a 2-hour job.

[0]: http://dpscschedule.azurewebsites.net/
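
To illustrate the view state point, here is a rough sketch (not the code behind the aggregator) of what a scraper typically has to do before each postback: pull the hidden fields such as `__VIEWSTATE` and `__EVENTVALIDATION` out of the last response and send them back with the next request. The sample HTML, the `Calendar.aspx` action, the control name `ctl00$Calendar1`, and the helper names are all made up for the example; real Telerik pages will likely need extra fields.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collects name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_postback(html, event_target, event_argument=""):
    """Build the form payload for the next request from the previous page."""
    parser = HiddenFieldParser()
    parser.feed(html)
    payload = dict(parser.fields)  # carries __VIEWSTATE etc. forward
    payload["__EVENTTARGET"] = event_target      # control that "fired"
    payload["__EVENTARGUMENT"] = event_argument  # e.g. which page or month
    return payload


# Hypothetical page fragment standing in for a real rink calendar response.
SAMPLE_PAGE = """
<form method="post" action="Calendar.aspx">
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="dDwtMTIzNDU2Nzg5Ozs+" />
  <input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="AbCdEf==" />
</form>
"""

payload = build_postback(SAMPLE_PAGE, "ctl00$Calendar1", "next")
body = urlencode(payload)  # form-encoded body to POST back to the page
```

Every response returns a fresh view state, so the parse-and-resend step has to be repeated for each request in the chain, which is a big part of why these sites take longer than expected.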