Hacker News
by bluecat22 3002 days ago
Look into commoncrawl.org, which provides a free web index that you can query. Now that cloud services are available, you could in theory download the index, load it into Google's BigQuery or AWS, and run your experiments there.
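
If you just want to poke at the index before loading anything into a warehouse, here is a rough sketch of querying it over HTTP. It assumes the public index server at index.commoncrawl.org and its `url` / `output=json` / `limit` query parameters; the crawl ID used in the usage note is only an example (the current list should be at index.commoncrawl.org/collinfo.json), and the helper names are my own, not part of any Common Crawl library.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the public Common Crawl index server.
INDEX_HOST = "https://index.commoncrawl.org"


def build_index_query(crawl_id, url_pattern, limit=5):
    """Build a query URL for one crawl's index (hypothetical helper)."""
    params = urlencode({"url": url_pattern, "output": "json", "limit": limit})
    return f"{INDEX_HOST}/{crawl_id}-index?{params}"


def parse_index_response(body):
    """Parse a response body with one JSON record per line, skipping blanks."""
    return [json.loads(line) for line in body.splitlines() if line.strip()]


def fetch_index_records(crawl_id, url_pattern, limit=5):
    """Run the query over the network and return the parsed records."""
    query_url = build_index_query(crawl_id, url_pattern, limit)
    with urlopen(query_url, timeout=30) as response:
        return parse_index_response(response.read().decode("utf-8"))
```

Something like `fetch_index_records("CC-MAIN-2024-10", "example.com")` should then give you a list of dicts, one per capture. For anything bigger than a handful of lookups, loading the index into BigQuery or AWS as described above is probably the better route.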