Usually as a starting point for a crawler that gathers statistics about the web.
Crawling is much faster if you don't have to "spider" all the links and check if you've already visited them or not.
With a big enough list, you can just iterate over those domains and compute whatever you're after: the average number of links on a website, how often JavaScript framework X gets used versus Y, how many sites have an HTML5 doctype yet, and so on.
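A rough sketch of what that iteration could look like in Python, assuming you already have the domains in a list. The names here (`collect_stats`, `fetch`, `LinkCounter`, `has_html5_doctype`) are made up for illustration, the doctype check is deliberately crude, and a real crawler would also want robots.txt handling, rate limiting and HTTPS.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkCounter(HTMLParser):
    """Counts <a> tags that carry an href attribute."""

    def __init__(self):
        super().__init__()
        self.links = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.links += 1


def has_html5_doctype(html):
    # Crude check: the page starts with "<!DOCTYPE html>" (case-insensitive).
    # Older doctypes look like "<!DOCTYPE html PUBLIC ...>" and do not match.
    return html.lstrip().lower().startswith("<!doctype html>")


def fetch(domain):
    # Assumption: the front page is reachable over plain HTTP.
    with urlopen("http://" + domain + "/", timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def collect_stats(domains, fetch_page=fetch):
    # No link following and no "visited" set: just walk the list once.
    pages = links = html5 = 0
    for domain in domains:
        try:
            html = fetch_page(domain)
        except OSError:
            continue  # skip domains that cannot be fetched
        counter = LinkCounter()
        counter.feed(html)
        pages += 1
        links += counter.links
        html5 += has_html5_doctype(html)
    return {
        "pages": pages,
        "avg_links": links / pages if pages else 0.0,
        "html5_doctype": html5,
    }
```

Calling `collect_stats(["example.com", "example.org"])` would fetch each front page once and return the totals. Passing your own `fetch_page` function lets you swap in a different HTTP client or test without touching the network.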