by volokoumphetico 4665 days ago
Very cool. Is it doing a blind depth-first crawl of any domain you throw at it?

It basically follows every link it finds, which can mean millions of links. You can also tell it to ignore certain links, either with regular expressions or through the Tales Java APIs.
Here is sample code for a depth-1 scrape:

https://github.com/calufa/tales-templates/blob/master/core/s...

This is the API call to start the scraper on Twitter:

http://localhost:8080/start?process=tales.scrapers.LoopScrap... -template tales.templates.FirstDepthTemplate -threads 2 -namespace com_twitter -baseURL twitter.com
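
To give a rough idea of the regular-expression link filtering mentioned above, here is a minimal standalone sketch in plain Java. It is not the Tales API: the `LinkFilter` class and its `ignore`, `shouldFollow`, and `filter` methods are hypothetical names made up for illustration, and the patterns and URLs are only examples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Tales): decides which discovered
// links a crawler should follow, based on a list of "ignore" regexes.
final class LinkFilter {
    private final List<Pattern> ignorePatterns = new ArrayList<>();

    // Register a regular expression; any link containing a match is skipped.
    LinkFilter ignore(String regex) {
        ignorePatterns.add(Pattern.compile(regex));
        return this;
    }

    // True if no ignore pattern matches anywhere in the URL.
    boolean shouldFollow(String url) {
        for (Pattern p : ignorePatterns) {
            if (p.matcher(url).find()) {
                return false;
            }
        }
        return true;
    }

    // Keep only the links that should be followed, preserving order.
    List<String> filter(List<String> links) {
        List<String> kept = new ArrayList<>();
        for (String link : links) {
            if (shouldFollow(link)) {
                kept.add(link);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        LinkFilter filter = new LinkFilter()
                .ignore("\\.(png|jpe?g|gif)$") // skip image files
                .ignore("/logout");            // skip session-ending links

        // Example links as they might be collected from a page.
        List<String> found = Arrays.asList(
                "http://twitter.com/about",
                "http://twitter.com/images/logo.png",
                "http://twitter.com/logout");

        System.out.println(filter.filter(found));
    }
}
```

In a real crawl the filter would run on every page's extracted links before they are queued, which is what keeps a "follow everything" scraper from wandering into millions of irrelevant URLs.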