Hacker News
by kurtsiegfried 4508 days ago
Nutch/Solr could provide a way to run a crawl, refine the parameters, and then feed the results into a separate tool that downloads the actual resources.
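
Roughly what I have in mind for the last step, as a minimal sketch rather than a tested setup: it assumes Nutch has already indexed the crawl into a Solr core, that the core exposes the standard `/select` handler, and that each document carries a `url` field. The core name `nutch`, the example query `type:pdf`, and the helper names are all hypothetical and would need adjusting to the actual schema.

```python
import json
import urllib.parse
import urllib.request
from pathlib import Path


def build_select_url(solr_base, query, rows=100):
    """Build a Solr /select URL that returns only the `url` field as JSON."""
    params = urllib.parse.urlencode(
        {"q": query, "fl": "url", "rows": rows, "wt": "json"}
    )
    return f"{solr_base.rstrip('/')}/select?{params}"


def extract_urls(response):
    """Pull the `url` values out of a parsed Solr JSON response."""
    docs = response.get("response", {}).get("docs", [])
    return [doc["url"] for doc in docs if "url" in doc]


def fetch_urls(solr_base, query, rows=100):
    """Query Solr and return the matching URLs (needs a reachable Solr)."""
    with urllib.request.urlopen(build_select_url(solr_base, query, rows)) as resp:
        return extract_urls(json.load(resp))


def download_all(urls, dest_dir):
    """Download each URL into dest_dir, prefixing names to avoid collisions."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for i, url in enumerate(urls):
        name = Path(urllib.parse.urlparse(url).path).name or "index"
        urllib.request.urlretrieve(url, str(dest / f"{i:05d}_{name}"))


if __name__ == "__main__":
    # Assumed local core and query; tweak the query until the result set
    # looks right, then run the download.
    base = "http://localhost:8983/solr/nutch"
    download_all(fetch_urls(base, "type:pdf", rows=50), "downloads")
```

The point of splitting it this way is that the query can be rerun and narrowed as often as needed against the index, and nothing gets downloaded until the URL list looks right.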