by wusatiuk 3509 days ago
What is your current stack, hosting, and crawling infrastructure? How many websites have you crawled so far?
1 comment

PHP and Python. Current count: 902873 websites.

I have also scraped all the websites from Quantcast but haven't scanned them yet.

The website is built in PHP, including the crawler that handles new websites not already in the list.

For scanning multiple websites in bulk, the crawler is written in Python and uses multiprocessing.
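
To give a rough idea of what the Python side could look like, here is a minimal sketch of a pool-based bulk scan. It is not the actual crawler: `scan_site`, `scan_all`, the plain `http://` fetch, the 10-second timeout, and the pool and chunk sizes are all assumptions made for illustration.

```python
import multiprocessing
import urllib.request


def scan_site(domain):
    """Fetch a site's home page; return (domain, HTTP status or error name).

    Hypothetical worker: a real scan would inspect the response, not just
    record the status code.
    """
    url = "http://" + domain
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return domain, response.status
    except Exception as exc:
        # Record the failure instead of letting one bad site stop the batch.
        return domain, type(exc).__name__


def scan_all(domains, worker=scan_site, processes=4):
    """Run `worker` on every domain across a pool of worker processes."""
    with multiprocessing.Pool(processes=processes) as pool:
        # map() returns results in input order; chunksize batches the tasks
        # sent to each process, which cuts overhead on long lists.
        return pool.map(worker, domains, chunksize=16)
```

A script would call something like `scan_all(["example.com", "example.org"])` from under an `if __name__ == "__main__":` guard, which multiprocessing needs on platforms that spawn rather than fork. The `worker` parameter is there so the fetch step can be swapped out, for example with a stub while testing.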