by qeorge 6022 days ago
May I suggest taking a look at Parsely? It's the syntax they use on www.parselets.com. The documentation for implementing it in your own apps is a little sparse, but the data format is awesome. Here's one that describes scraping HN:

http://parselets.com/parselets/yc/14

It might not be a fit for your project, but in terms of describing parsing instructions to a crawler, it's the best format I've ever seen.

1 comment

I'm not crawling, but that looks pretty interesting. I'll bookmark it and take a look later for sure - thanks!