by Seirdy
1556 days ago
I've heard from other people who run search engines (Right Dao, Gigablast) that this is a major problem. Common Crawl does look helpful, but it's not continuously updated. FWIW, Right Dao uses Wikipedia as a starting point for crawling. Kiwix makes pre-indexed dumps of Wikipedia, Stack Exchange, and other sites available. Some sort of partnership between crawlers could go a long way. Have you considered contributing content back to Common Crawl?