by arkitaip
5655 days ago
Very timely and interesting. I am currently looking for a crawler that is tightly integrated with Drupal and can be easily managed through Drupal nodes. Any suggestions for a solution for a small site that only needs to handle a few thousand pages/URLs?
For regular crawling:
I found Anemone (http://anemone.rubyforge.org/) to be a lovely framework for single page crawls.
Other interesting candidates:
https://github.com/hasmanydevelopers/RDaneel
http://www.redaelli.org/matteo-blog/projects/ebot/
http://nutch.apache.org/ (meh, Java)