by jng 1472 days ago
We used this in a project many suns ago and ended up switching to libxml2: less pretty presentation, but more functional. YMMV.

bs4 introduced some very nice features over bs3, if that's what you were using, including the ability to use libxml2 (through the lxml package) as a parser. For very simple things, though, libxml2 would be a better fit.
bs4 is able to parse some malformed documents that libxml2 chokes on.
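
To make the parser choice concrete, here is a rough sketch of how bs4 lets you pick the backend by name. It assumes `beautifulsoup4` is installed and that `lxml` may or may not be; the helper `parse_html` and the sample markup are made up for illustration, not anything from the bs4 docs.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def parse_html(markup, preferred="lxml"):
    """Hypothetical helper: try the preferred parser, else use the stdlib one."""
    try:
        # "lxml" selects the libxml2-backed parser if the lxml package is present.
        return BeautifulSoup(markup, preferred)
    except FeatureNotFound:
        # bs4 raises FeatureNotFound when the requested parser is not installed.
        return BeautifulSoup(markup, "html.parser")


# Deliberately malformed: the <b> tag is never closed.
soup = parse_html("<p>Hello <b>world</p>")
bold_text = soup.find("b").get_text()
print(bold_text)
```

Different backends can build slightly different trees from the same broken markup, so it is worth pinning the parser name explicitly rather than relying on whatever happens to be installed.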

For these cases it can be useful to do the reverse, and use the BeautifulSoup HTML parser as an alternative parser backend for the lxml package: https://lxml.de/elementsoup.html
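
A minimal sketch of that reverse setup, assuming both `lxml` and `beautifulsoup4` are installed. The sample markup is invented; `lxml.html.soupparser.fromstring` is the entry point described on the linked page.

```python
from lxml.html import soupparser

# Deliberately malformed markup: the <b> tag is never closed.
broken = "<p>Hello <b>world</p>"

# BeautifulSoup does the parsing; the result is an ordinary lxml element tree.
root = soupparser.fromstring(broken)

# From here on the usual lxml API applies, e.g. ElementPath lookups.
bold_text = root.findtext(".//b")
print(bold_text)
```

The payoff is that the rest of the code can stay on the lxml API (XPath, `find`, and so on) and only the parsing step changes.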