Hacker News
by
mark_l_watson
4695 days ago
Check out the Common Crawl contest-winning projects on the linked page. There is some very good work there, and it is a good source of ideas and techniques:
http://commoncrawl.org/the-winners-of-the-norvig-web-data-sc...
Some good stuff!
wicknicks
4695 days ago
I loved the inter-lingual web page linkage visualization project. Any idea why Traitor won the contest? It seems very similar to the regular "create an inverted index with MapReduce" problem, or am I missing something?
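For readers who have not seen the "inverted index with MapReduce" pattern mentioned here, below is a minimal single-process sketch in plain Python. It is not taken from Traitor or any contest entry; the names `map_phase`, `reduce_phase`, and `build_inverted_index` are hypothetical, and the whitespace tokenizer is a simplifying assumption. A real job would run the same two phases on a framework such as Hadoop, with the framework doing the grouping by key.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Map step: emit one (term, doc_id) pair per distinct token.

    Assumption: tokens are lowercase, whitespace-separated words.
    """
    for term in set(text.lower().split()):
        yield term, doc_id


def reduce_phase(term, doc_ids):
    """Reduce step: collapse all doc ids seen for a term into a sorted posting list."""
    return term, sorted(set(doc_ids))


def build_inverted_index(docs):
    """Simulate the shuffle in memory: group mapper output by term, then reduce."""
    grouped = defaultdict(list)
    for doc_id, text in docs.items():
        for term, emitted_id in map_phase(doc_id, text):
            grouped[term].append(emitted_id)
    return dict(reduce_phase(term, ids) for term, ids in grouped.items())


if __name__ == "__main__":
    # Hypothetical toy corpus, used only for illustration.
    docs = {"page1": "Common Crawl web data", "page2": "web index"}
    print(build_inverted_index(docs))
```

The in-memory `defaultdict` stands in for the shuffle stage, so this only illustrates the shape of the computation, not how it scales.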
mark_l_watson
4694 days ago
Perhaps Traitor won because it is such a good example of using MapReduce over the Common Crawl data? I agree that the inter-lingual project was cool.