Is this truly crawling the sites, or querying each site's search service?
E.g. for Twitter, are you crawling every user's URL (which seems difficult to do: how do you find new accounts' URLs?), or are you fetching results from search.twitter.com?