Hacker News
by einhverfr 5012 days ago
I think we should create an app called clscraper that does the following:

Given a set of resources, it crawls the site and extracts specific information from listings (price, number of bedrooms, number of bathrooms, square footage, contact info, etc.), then puts it all in a very simple database. This should be 100% trouble-free copyright-wise, at least in the US. The larger issue is what happens under other countries' laws. However, building a generic tool to extract (non-copyright-worthy!) facts from ads should by itself be more or less trouble-free.

Make this generic enough to work on most ad sites out there. Push the boundary back.
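
To make the idea concrete, here is a rough sketch of the extraction and database steps described above, using only the Python standard library. Everything in it is an assumption for illustration: the `PATTERNS` table, the `extract_facts` and `store` functions, and the `listings` table layout are all hypothetical, and the regexes are guesses at how a typical housing ad is worded. The crawling part is left out on purpose; each site would need its own fetching logic and probably its own patterns.

```python
import re
import sqlite3

# Hypothetical patterns for a typical housing ad. Keeping them in a plain
# dict is what would let the tool be pointed at other ad sites: swap in a
# different dict per site.
PATTERNS = {
    "price": re.compile(r"\$\s*([\d,]+)"),
    "bedrooms": re.compile(r"(\d+)\s*(?:bed(?:room)?s?|br)\b", re.I),
    "bathrooms": re.compile(r"(\d+(?:\.\d+)?)\s*(?:bath(?:room)?s?|ba)\b", re.I),
    "sqft": re.compile(r"([\d,]+)\s*(?:sq\.?\s*ft|square feet)", re.I),
    "contact": re.compile(r"(\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4})"),
}


def extract_facts(text):
    """Pull bare facts out of one listing's text; missing fields become None."""
    facts = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(text)
        facts[field] = match.group(1) if match else None
    # Normalize the numeric fields so they can be queried and compared.
    for field in ("price", "bedrooms", "sqft"):
        if facts[field] is not None:
            facts[field] = int(facts[field].replace(",", ""))
    if facts["bathrooms"] is not None:
        facts["bathrooms"] = float(facts["bathrooms"])
    return facts


def store(conn, url, facts):
    """Save one listing's facts in a very simple SQLite table keyed by URL."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS listings ("
        "url TEXT PRIMARY KEY, price INTEGER, bedrooms INTEGER, "
        "bathrooms REAL, sqft INTEGER, contact TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO listings VALUES (?, ?, ?, ?, ?, ?)",
        (url, facts["price"], facts["bedrooms"], facts["bathrooms"],
         facts["sqft"], facts["contact"]),
    )
    conn.commit()
```

Usage would be something like `store(sqlite3.connect("listings.db"), url, extract_facts(page_text))` for each page the crawler fetches. Note that only the extracted facts are stored, never the ad text itself, which is the whole point of the approach.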