| HN Mirror



	by brodouevencode 1473 days ago
	bs4 introduced some very nice features over bs3, if that's what you were using, and includes the ability to use libxml2 (through the lxml package) as a parser. For very simple things, though, using libxml2 directly would be a better fit.
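	For illustration, a minimal sketch of picking the libxml2-backed parser from bs4. It assumes both the `beautifulsoup4` and `lxml` packages are installed; the helper name `parse_with_lxml` and the sample markup are made up for this example.

```python
def parse_with_lxml(markup):
    """Parse markup with BeautifulSoup, using lxml (libxml2) as the backend."""
    # Imported lazily because bs4 is an optional third-party dependency.
    from bs4 import BeautifulSoup

    # The second argument selects the parser; "lxml" requires the lxml package.
    return BeautifulSoup(markup, "lxml")


if __name__ == "__main__":
    soup = parse_with_lxml("<p>Hello <b>world</b></p>")
    print(soup.b.get_text())
```

	Swapping `"lxml"` for `"html.parser"` would fall back to the parser in the standard library, which is the usual choice when lxml is not available.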

1 comment

jamessb 1473 days ago

bs4 is able to parse some malformed documents that libxml2 chokes on.

For these cases it can be useful to do the reverse, and use the BeautifulSoup HTML parser as an alternative parser backend for the lxml package: https://lxml.de/elementsoup.html
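A rough sketch of that reverse setup, using `lxml.html.soupparser` as described on the linked page. It assumes `lxml` and BeautifulSoup are both installed; the helper name `parse_with_soup_backend` and the broken sample markup are invented for the example.

```python
def parse_with_soup_backend(markup):
    """Parse HTML via BeautifulSoup, but return an lxml element tree."""
    # Imported lazily: this module needs both lxml and BeautifulSoup installed.
    from lxml.html import soupparser

    # fromstring() hands the markup to BeautifulSoup and converts the result
    # into lxml elements, returning the root element of the tree.
    return soupparser.fromstring(markup)


if __name__ == "__main__":
    root = parse_with_soup_backend("<p>unclosed <b>bold")
    # The result is an ordinary lxml tree, so the usual lxml API applies.
    print(root.tag, root.findtext(".//b"))
```

The trade-off is speed: BeautifulSoup does the tolerant parsing, so this path is likely slower than lxml's native parser and probably only worth it for documents that libxml2 cannot handle.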
