by johnb 6645 days ago
I'm a big fan of using Hpricot + Ruby. I'd name the sites I had been scraping, but I doubt my old client wants that to come out :|

To get the most bang for my buck (developer-time-wise), I would visit each site with Firebug in inspect mode and hover over the data I wanted to extract. From there I'd figure out how I would style that element with CSS, and because Hpricot supports CSS selectors, I'd straight away have a method for pulling that data out of the page.
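
A minimal sketch of that selector-first workflow, not the code from the original project. The page fragment and the `td.price` selector are made up for illustration. Hpricot is no longer maintained, so the sketch swaps in REXML from Ruby's standard distribution plus a small hypothetical helper, `css_select`, that understands only `tag.class` selectors. With Hpricot itself the lookup would be roughly `Hpricot(html).search("td.price")`.

```ruby
require "rexml/document"

# Hypothetical page fragment. Real pages would be fetched over HTTP and are
# rarely well-formed XML, which REXML requires (Hpricot was more forgiving).
PAGE_HTML = <<~PAGE
  <table>
    <tr><td class="name">Widget</td><td class="price">9.99</td></tr>
    <tr><td class="name">Gadget</td><td class="price sale">4.50</td></tr>
  </table>
PAGE

# Hypothetical helper: finds elements matching a simple "tag.class" selector,
# the kind you would write to style the element you hovered in Firebug.
# Ids, descendant combinators, and attribute selectors are not handled.
def css_select(doc, selector)
  tag, klass = selector.split(".", 2)
  tag = "*" if tag.nil? || tag.empty?
  REXML::XPath.match(doc, "//#{tag}").select do |el|
    klass.nil? || el.attributes["class"].to_s.split.include?(klass)
  end
end

# Returns the text of every element matching the selector, in document order.
def extract(markup, selector)
  doc = REXML::Document.new(markup)
  css_select(doc, selector).map { |el| el.text.to_s.strip }
end

puts extract(PAGE_HTML, "td.price").inspect
```

The point is that the selector string is the only site-specific part: once you know how you would style the element, the extraction call stays the same from site to site.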