| HN Mirror



	by boomzilla 4483 days ago
	I've been mulling over this idea for a while now and have actually implemented a prototype. The current roadblock, as someone said earlier in the thread, is the data sources. I've implemented a crawler that pulls from roughly 10 different sites, but custom per-site crawling clearly doesn't scale: there are just too many websites out there, and I don't have crawling infrastructure on the scale of Google or Bing, in either machine power or development effort. I'm thinking of making my prototype more human-driven. For example, users (like you) could list the sites they often check for events, and maybe give the system some hints on how to extract the events from those sites. Another approach is MTurk, which is something I'll try next.
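
To make the "user-supplied hint" idea concrete, here is a minimal sketch, not what the prototype actually does. It assumes a hint is just a regular expression with named groups `date` and `title` that a user submits along with a site URL. `SiteCue`, `extract_events`, the example URL, and the sample HTML are all hypothetical names and data made up for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class SiteCue:
    """A hypothetical user-submitted hint for one site."""
    url: str      # page the user checks for events
    pattern: str  # regex with named groups 'date' and 'title' (an assumption)


def extract_events(html: str, cue: SiteCue) -> list[dict[str, str]]:
    """Apply the user's pattern to a page and return one dict per match."""
    regex = re.compile(cue.pattern, re.DOTALL)
    return [m.groupdict() for m in regex.finditer(html)]


# Made-up page content standing in for a fetched event listing.
SAMPLE_HTML = (
    '<li class="event"><span class="date">2013-05-01</span> '
    '<a href="/e/1">Python Meetup</a></li>\n'
    '<li class="event"><span class="date">2013-05-03</span> '
    '<a href="/e/2">Hack Night</a></li>'
)

cue = SiteCue(
    url="https://example.com/events",  # placeholder URL
    pattern=r'<span class="date">(?P<date>[^<]+)</span>\s*'
            r'<a[^>]*>(?P<title>[^<]+)</a>',
)

events = extract_events(SAMPLE_HTML, cue)
for event in events:
    print(event["date"], event["title"])
```

Regexes over HTML are brittle, so a real version would more likely take something like a CSS selector and run it through a proper HTML parser. The point is only that the per-site knowledge moves out of the crawler code and into small pieces of data that users can contribute.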