by marginalia_nu 438 days ago
I think most crawlers inevitably tend to turn into spaghetti code because of the number of weird corner cases you need to deal with.

Crawlers are also incredibly difficult to test in a comprehensive way. No matter what test scenarios you come up with, there are a hundred more weird cases in the wild. (E.g. there's a world of difference between a server taking a long time to respond to a request, and a server sending headers quickly but taking a long time to send the body.)
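
To make the slow-headers vs. slow-body distinction concrete, here is a minimal Python sketch using only the standard library. It is not taken from any particular crawler: `read_body`, `fetch`, `BodyTooSlow` and the timeout values are hypothetical names and numbers picked for illustration. The idea is that a socket timeout only bounds each individual blocking operation, so it catches a server that stalls before sending headers, but a server that trickles the body a few bytes at a time needs a separate deadline on the whole read.

```python
import time
import urllib.request


class BodyTooSlow(Exception):
    """Raised when the body is not complete within the total deadline."""


def read_body(stream, total_deadline_s, chunk_size=8192, clock=time.monotonic):
    """Read `stream` to EOF, enforcing a deadline on the body as a whole.

    `read1` performs at most one underlying read per call, so control
    returns to this loop between reads and the deadline can be checked.
    The check happens between reads, so the overshoot is bounded by the
    per-read socket timeout rather than being zero.
    """
    start = clock()
    chunks = []
    while True:
        if clock() - start > total_deadline_s:
            raise BodyTooSlow(f"body not complete after {total_deadline_s}s")
        chunk = stream.read1(chunk_size)
        if not chunk:  # empty bytes means EOF
            return b"".join(chunks)
        chunks.append(chunk)


def fetch(url, header_timeout_s=10.0, body_deadline_s=30.0):
    """Fetch `url` with separate limits for the two failure modes.

    Both default values are arbitrary assumptions, not recommendations.
    """
    # The socket timeout covers connecting and waiting for the headers.
    # It also applies to each later read, but not to their sum.
    with urllib.request.urlopen(url, timeout=header_timeout_s) as resp:
        return resp.status, read_body(resp, body_deadline_s)
```

The `clock` parameter is there so the slow-body case can be tested with a fake clock instead of a real slow server, which is one small way to chip away at the testing problem.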

1 comment

I thrive on these kinds of moving-target challenges. But nobody will hire.