by s1lv3rj1nx 787 days ago
It can also scrape linked pages if you define a depth, but keep the depth parameter small; otherwise it will consume too much memory and time.
1 comment

Playing around with the UI, I cannot see where that depth would be set. Is it not a per-datasource variable?

Is the "scrape linked pages" option "sandboxed" within a URL hierarchy (so that adding example.com/foo/ would add only linked pages that are also under example.com/foo/), or not (so that it would also include linked pages on other domains or in other subfolders)?
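
To make the question concrete, here is a rough sketch of what "sandboxed" scraping with a depth limit could look like. It is not the tool's actual code: `in_scope`, `crawl`, `get_links` and `max_depth` are hypothetical names made up for illustration, and root URLs are assumed to include a scheme (https://...).

```python
from collections import deque
from urllib.parse import urljoin, urlsplit


def in_scope(root: str, link: str) -> bool:
    """True if link is on the same host as root and under root's path."""
    r = urlsplit(root)
    target = urlsplit(urljoin(root, link))  # resolves relative links against root
    root_path = r.path if r.path.endswith("/") else r.path + "/"
    return target.netloc == r.netloc and target.path.startswith(root_path)


def crawl(root, get_links, max_depth):
    """Breadth-first walk from root, following only in-scope links.

    get_links is a hypothetical callable returning the links found on a page;
    max_depth caps how many link hops away from root are followed.
    """
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit reached: do not expand this page's links
        for link in get_links(url):
            target = urljoin(url, link)
            if target not in seen and in_scope(root, target):
                seen.add(target)
                queue.append((target, depth + 1))
    return seen
```

With a check like this, a link to another domain or to example.com/bar would be skipped when the data source is example.com/foo/. The `seen` set is also where the memory cost of a large depth shows up, since it grows with every page reached.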