by Jaruzel 3708 days ago
Hmm, I wondered a while back if anyone was keeping track of these. It seems to be a manual list so far, so I added my main server, as that's been transmitting the clacks overhead since he died. Can't wait for the crawler to go live, and then we'll see how many people did add it, including any big-name sites.

Hi Jaruzel, the crawler is live :D You can see at the bottom which page it is crawling right now.

It basically scrapes all the links on every page it hits and tests whether the response headers contain the clacks value.
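
Roughly, those two steps could look like the sketch below. This is not the actual crawler code: the function names `findClacks` and `extractLinks` are made up for illustration, the header is assumed to be `X-Clacks-Overhead`, and a regex stands in for a proper HTML parser to keep the example dependency-free.

```javascript
// Sketch only: names below are illustrative, not taken from the real crawler.
const CLACKS_HEADER = "x-clacks-overhead"; // assumed header name, lowercased

// Return the clacks value from a plain headers object, or null if absent.
// Header names are compared case-insensitively.
function findClacks(headers) {
  for (const [name, value] of Object.entries(headers)) {
    if (name.toLowerCase() === CLACKS_HEADER) {
      return String(value);
    }
  }
  return null;
}

// Collect absolute http(s) links from the href attributes in an HTML string.
// A regex is enough for a sketch; a real crawler would use an HTML parser.
function extractLinks(html, baseUrl) {
  const links = new Set();
  const hrefPattern = /<a\b[^>]*\bhref\s*=\s*["']([^"']+)["']/gi;
  for (const match of html.matchAll(hrefPattern)) {
    try {
      const url = new URL(match[1], baseUrl); // resolve relative links
      if (url.protocol === "http:" || url.protocol === "https:") {
        url.hash = ""; // drop fragments so the same page is not queued twice
        links.add(url.href);
      }
    } catch (err) {
      // Ignore hrefs that cannot be parsed as URLs.
    }
  }
  return [...links];
}
```

A crawl loop would then fetch a page, pass its response headers to `findClacks`, and push whatever `extractLinks` returns onto the queue of pages still to visit.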

I added the form so people can submit pages to speed up the development of the list, even though I believe eventually the crawler would get to their pages :D

Cool! What software are you using for the crawler?
I built a not-very-advanced one with Node.js and MongoDB :)

Mainly in use in the crawler:
* request
* cheerio