| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by pedrovhb 1539 days ago
	Well, yes - he's saying "regex is not appropriate for parsing html", and I'm saying "regex is faster than parsing html" - they're not contradictory statements, and both are true :) To be clear, I'm not talking about building a syntax tree or a way to generically extract elements based on a CSS path selector. I'm saying if you're only interested in a couple of data points in a 3 MB HTML document, and you're sure they're always between some other specific text or even tags, then it's more efficient to use a simple regex than it is to parse the entire thing, which is computationally expensive when running over a large number of large files.