by mcovey 1740 days ago
I have been using hxselect from the html-xml-utils package to do this for many, many years.

It doesn't handle malformed HTML that well, but it can be coaxed into working about 90% of the time with the help of hxclean (another tool included in the same package) or something like html-tidy.
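
For anyone who wants to try it, here is a rough sketch of that kind of pipeline. It assumes html-xml-utils is installed. The function name extract_html is made up for illustration, and the only hxselect options it relies on are -c (print content only) and -s (separator printed after each match).

```shell
#!/bin/sh
# Hypothetical helper (not part of html-xml-utils): read HTML on stdin,
# repair it with hxclean, then print the content of every element
# matching the CSS selector given as the first argument.
extract_html() {
    # hxclean applies heuristics to fix up malformed markup;
    # hxselect -c prints only the content of each match, and
    # -s '\n' prints a newline after each one.
    hxclean | hxselect -c -s '\n' "$1"
}
```

You would feed it a page on stdin, something like `curl -s https://example.com | extract_html 'title'`. If hxclean can't repair a particular page, swapping in html-tidy for that stage is the other option mentioned above.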