by jraines 6647 days ago
I use Ruby, which has nice regex support and good libraries (open-uri, REXML), plus the Hpricot and Mechanize gems.

Yahoo Pipes is also fun to play with, and Firebug is the scraper's best friend.

Right now I'm working on scraping public LinkedIn data. In the past I've done Craigslist and Twitter. I haven't done anything really hard, though -- mostly things that can be read as XML.
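
For the "can be read as XML" case, a minimal sketch with REXML might look something like this. The feed contents and the extract_items helper are made up for illustration; a real scraper would fetch the document with open-uri first and would need to handle missing elements.

```ruby
require 'rexml/document'

# Hypothetical feed contents (assumption, not from any real site).
# In practice this string would come from open-uri.
SAMPLE_FEED = <<~XML
  <rss version="2.0">
    <channel>
      <title>Example listings</title>
      <item><title>First post</title><link>http://example.com/1</link></item>
      <item><title>Second post</title><link>http://example.com/2</link></item>
    </channel>
  </rss>
XML

# Hypothetical helper: returns one { title:, link: } hash per <item> element.
def extract_items(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//item').map do |item|
    {
      title: item.elements['title'].text,
      link:  item.elements['link'].text
    }
  end
end

items = extract_items(SAMPLE_FEED)
items.each { |i| puts "#{i[:title]} -> #{i[:link]}" }
```

Anything that isn't well-formed XML (most real HTML) is where Hpricot or Mechanize come in instead.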

Here are a few cool links if you're interested in scraping with Ruby: http://del.icio.us/jeremyraines/scraping