by brodouevencode 1473 days ago
bs4 introduced some very nice features over bs3, if that's what you were using, including the ability to use lxml (and through it libxml2) as the parser. For very simple things, though, libxml2 on its own would be a better fit.
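A rough sketch of what choosing the parser looks like in bs4, assuming the lxml package (the Python binding for libxml2) is installed alongside bs4. The helper name `list_item_texts` and the `SAMPLE` markup are made up for illustration:

```python
from bs4 import BeautifulSoup


def list_item_texts(markup, parser="lxml"):
    """Return the text of every <li> in markup, parsed with the named backend."""
    # The second argument selects the parser: "lxml" is libxml2-backed,
    # "html.parser" is the pure-Python parser from the standard library.
    soup = BeautifulSoup(markup, parser)
    return [li.get_text(strip=True) for li in soup.find_all("li")]


# Hypothetical sample input, not taken from the thread.
SAMPLE = "<ul><li>one</li><li>two</li></ul>"
```

Calling `list_item_texts(SAMPLE)` goes through lxml; passing `"html.parser"` as the second argument falls back to the standard-library parser if lxml is not available.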

bs4 is able to parse some malformed documents that libxml2 chokes on.

For these cases it can be useful to do the reverse, and use the BeautifulSoup HTML parser as an alternative parser backend for the lxml package: https://lxml.de/elementsoup.html
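Something like the following should work as a minimal sketch of that reverse setup, assuming both lxml and bs4 are installed. `lxml.html.soupparser.fromstring` is the entry point described on the linked page; the wrapper `parse_tag_soup` and the `BROKEN` sample are hypothetical:

```python
from lxml.html.soupparser import fromstring


def parse_tag_soup(markup):
    """Parse markup with BeautifulSoup and return the root of an lxml tree."""
    # soupparser lets BeautifulSoup do the lenient parsing, then converts
    # the result into ordinary lxml elements.
    return fromstring(markup)


# Hypothetical malformed input: the <b> tag is never closed.
BROKEN = "<p>Hello <b>world</p>"
```

The returned root is a regular lxml element, so the usual lxml calls such as `root.xpath("//b")` or `root.itertext()` apply to it.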