by jamessb 1472 days ago
bs4 is able to parse some malformed documents that libxml2 chokes on.

For these cases it can be useful to do the reverse and use BeautifulSoup as the parser backend for lxml, via lxml.html.soupparser: https://lxml.de/elementsoup.html
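
A rough sketch of what that looks like, assuming both lxml and beautifulsoup4 are installed. The helper name parse_with_soup and the sample markup are made up for illustration; soupparser.fromstring is the entry point described on the linked page.

```python
def parse_with_soup(markup):
    """Parse possibly malformed HTML with BeautifulSoup, return an lxml element.

    Hypothetical helper for illustration, not part of either library.
    """
    # Imported lazily: lxml.html.soupparser raises ImportError on import
    # when BeautifulSoup is not installed alongside lxml.
    from lxml.html import soupparser

    # BeautifulSoup does the tolerant parsing; the result is converted
    # into an ordinary lxml tree.
    return soupparser.fromstring(markup)


if __name__ == "__main__":
    # Unclosed <p> and <b> tags, no closing </body> or </html>.
    broken = "<html><body><p>first<p>second <b>bold"
    root = parse_with_soup(broken)

    # From here on it is a normal lxml tree, so XPath and friends work.
    print(root.tag)
    print(root.xpath("//b/text()"))
```

The result is a regular lxml element, so XPath, iteration and serialization work as usual. The exact tree shape for broken input depends on how BeautifulSoup repairs it, so it may differ from what libxml2 would have produced.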