by cmjqol 2901 days ago
I always assumed web scraping wasn't particularly challenging, given how many libraries exist for it.

This article made me realize I assumed wrong.

1 comment

It gets especially difficult with dynamic content, or when scraping sites built on very heavy frameworks like ASP.NET Web Forms, which require passing the view state with every request. I made a calendar aggregator for adult hockey times in my area that scrapes rink websites[0], and it was far more difficult than I had expected because the rinks all used Telerik Web Forms controls for their calendars. It turned a 30-minute job into a 2-hour job.

[0]: http://dpscschedule.azurewebsites.net/
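
For anyone who hasn't hit this before, here is a rough sketch of what "passing the view state" tends to involve: read the hidden inputs (`__VIEWSTATE`, `__EVENTVALIDATION`, and so on) out of the page you just fetched and echo them back in the next POST, along with `__EVENTTARGET` and `__EVENTARGUMENT` to say which control fired. This is not the code behind the aggregator above. The sample HTML, the `Calendar.aspx` action, the `ctl00$Calendar1` control name, and the field values are all made up for illustration, and Telerik controls may expect additional fields. It uses only the Python standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_postback(html, event_target, event_argument=""):
    """Return the form fields for a simulated Web Forms postback.

    Every hidden field from the fetched page is echoed back unchanged,
    then the two event fields are set to name the control being "clicked".
    """
    parser = HiddenFieldParser()
    parser.feed(html)
    payload = dict(parser.fields)
    payload["__EVENTTARGET"] = event_target
    payload["__EVENTARGUMENT"] = event_argument
    return payload


# Hypothetical page fragment; real view state values are much longer.
SAMPLE_PAGE = """
<form method="post" action="./Calendar.aspx">
  <input type="hidden" name="__VIEWSTATE" value="dDwtMTIzNDU2Nzg5Ow==" />
  <input type="hidden" name="__EVENTVALIDATION" value="/wEWAgKj" />
  <input type="text" name="search" value="ignored" />
</form>
"""

# Hypothetical control name and argument for a "next month" click.
payload = build_postback(SAMPLE_PAGE, "ctl00$Calendar1", "nextMonth")

# URL-encoded body that would be sent as the next POST request.
body = urlencode(payload)
```

In a real scraper you would repeat this on every response, since the server generally issues fresh hidden-field values each time, which is a large part of why these sites take longer than expected.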