by shavenwarthog2 4444 days ago
Python :-) There are libraries specifically for parsing malformed HTML. I'm happy using Unix tools for scraping and parsing, but you hit a brick wall rather quickly once the markup gets messy. Python is more reliable, more flexible, and easier to integrate.
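To make that concrete, here is a minimal sketch of pulling links out of broken markup. The comment doesn't name a library, so this assumes the standard library's `html.parser` just to stay dependency-free; third-party parsers such as Beautiful Soup or lxml are the more usual choice for this job. The `LinkExtractor` class, the `extract_links` helper, and the sample input are all made up for illustration.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect href values from <a> tags, tolerating broken markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return every href found in the given (possibly malformed) HTML."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical input: unclosed <p>, <b>, and <a> tags, plus an unquoted attribute.
BROKEN_HTML = (
    '<html><body><p>First <b>bold <a href="/one">one'
    '<p>Second <a href=/two>two</body>'
)

print(extract_links(BROKEN_HTML))
```

The parser never checks that tags are balanced, so the missing closing tags don't stop it. Doing the same with grep or sed means writing a regex that guesses at quoting and line breaks, which is roughly where the brick wall shows up.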