by mark_l_watson 4695 days ago
Check out the Common Crawl contest-winning projects on the linked page. There is some very good work there, and it is a good source of ideas and techniques: http://commoncrawl.org/the-winners-of-the-norvig-web-data-sc...

Some good stuff!


I loved the inter-lingual web page linkage visualization project. Any idea why Traitor won the contest? It seems very similar to the regular "create inverted index with map reduce" problem, or am I missing something?
Perhaps Traitor won because it is such a good example of using Map Reduce over the Common Crawl data? I agree that inter-lingual was a cool project.
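
For anyone unfamiliar with the "create inverted index with map reduce" pattern mentioned above, here is a minimal single-process sketch in plain Python. It is only an illustration of the general technique, not Traitor's code or anything taken from the contest entries; the function names and the two sample documents are made up for the example, and a real job would run the map and reduce steps on a cluster rather than in one loop.

```python
from collections import defaultdict

def map_phase(doc_id, text):
    """Emit (word, doc_id) pairs, one per distinct word in the document."""
    for word in set(text.lower().split()):
        yield word, doc_id

def reduce_phase(word, doc_ids):
    """Collapse all doc ids seen for a word into a sorted posting list."""
    return word, sorted(set(doc_ids))

def build_inverted_index(docs):
    # Stand-in for the shuffle step: group mapper output by key (the word).
    grouped = defaultdict(list)
    for doc_id, text in docs.items():
        for word, emitted_id in map_phase(doc_id, text):
            grouped[word].append(emitted_id)
    # Run the reducer once per word to get word -> posting list.
    return dict(reduce_phase(word, ids) for word, ids in grouped.items())

# Hypothetical sample data, not real Common Crawl content.
docs = {
    "page1": "common crawl web data",
    "page2": "web data contest",
}
index = build_inverted_index(docs)
print(index["web"])  # ['page1', 'page2']
```

The mapper emits one pair per distinct word so that the reducer only has to merge and sort document ids; storing term frequencies or positions would mean emitting richer values from the mapper.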