by qeorge
6022 days ago
May I suggest taking a look at Parsely? It's the syntax they use on www.parselets.com. The documentation for implementing it in your own apps is a little sparse, but the data format is awesome. Here's one that describes scraping HN: http://parselets.com/parselets/yc/14 It might not be a fit for your project, but in terms of describing parsing instructions to a crawler, it's the best format I've ever seen.