by pedrovhb
1493 days ago
Well, yes - he's saying "regex is not appropriate for parsing html", and I'm saying "regex is faster than parsing html" - they're not contradictory statements, and both are true :) To be clear, I'm not talking about building a syntax tree or generically extracting elements with a CSS selector. I'm saying that if you're only interested in a couple of data points in a 3 MB HTML document, and you're sure they always sit between some other specific text or tags, then a simple regex is more efficient than parsing the entire document. Full parsing is computationally expensive, and that cost adds up when you're running over a large number of large files.
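To make that concrete, here's a minimal sketch of the kind of thing I mean. The `<span id="price">` markers, the `extract_price` name, and the synthetic filler document are all made up for illustration; the only assumption that matters is that the value always sits between the same two known strings.

```python
import re
from typing import Optional

# Hypothetical markers: assume the value we want always appears as
# <span id="price">...</span>, exactly like this, in every document.
PRICE_RE = re.compile(r'<span id="price">([^<]*)</span>')


def extract_price(html: str) -> Optional[str]:
    """Return the text between the known markers, or None if they are absent."""
    match = PRICE_RE.search(html)
    return match.group(1) if match else None


# Synthetic stand-in for a large page: roughly 3 MB of filler markup
# with the one data point we care about buried in the middle.
filler = "<div><p>filler</p></div>\n" * 60_000
doc = "<html><body>" + filler + '<span id="price">19.99</span>' + filler + "</body></html>"

print(extract_price(doc))  # prints 19.99
```

The regex does a single scan for a literal prefix and never builds a tree, which is where the savings come from. It also breaks the moment the markup around the value changes (extra attributes, different quoting, nested tags), so it only makes sense when you control or trust the format.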