Hacker News new | ask | show | jobs
by mholt 4679 days ago
This is cool... if the content is structured. (Ever tried finding addresses in arbitrary text? Much harder: http://smartystreets.com/products/liveaddress-api/extract)
1 comments

Come on, that's not really a scraping problem, it's more of a text parsing problem coupled with an API lookup or scrape to verify the address.

Though, id probably just google for some good address regexes, match against pages, for each address just throw them into something like maps.google.com/?q=[address] then try to scrape whatever normally pops up for a valid result. Also helps if you're expecting addresses to be in a certain country.