|
|
|
|
|
by EmilStenstrom
187 days ago
|
|
It was usable! Yeah, the handler based architecture that I had built on was very dependent on object lookups and method calls, and my hunch was that I had hit a wall trying to optimize the speed. I was slower than html5lib still, so decided to go with another "code architecture" (html5ever) that was closer to the metal. Worked out in getting me ~60% faster than html5lib. As for bs4, if you don't change the default, you get the stdlib html.parser, which doesn't implement html5. Only works for valid HTML. |
|