Hacker News
by hashmush 1493 days ago
There's a big difference between parsing HTML and

> using regex to parse data when the data you're scraping has a constant enough structure

Regex is fine for that; just don't use it to parse the HTML itself.
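A rough sketch of that split, using only the Python standard library: a real HTML parser finds the elements, and regex is applied only to the text pulled out of them. The sample markup, the `price` class, and the names `PriceTextCollector` and `extract_prices` are all made up for illustration, not taken from any real site or library.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample page; the markup and class names are invented for this sketch.
SAMPLE_HTML = """
<ul>
  <li class="price">Widget: $19.99</li>
  <li class="price">Gadget: $5.00</li>
  <li class="note">Prices exclude tax ($2.00 flat)</li>
</ul>
"""


class PriceTextCollector(HTMLParser):
    """Collect the text content of every <li class="price"> element."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        # The parser deals with tags and attributes; no regex touches the markup.
        if tag == "li" and dict(attrs).get("class") == "price":
            self._in_price = True
            self.texts.append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_price = False

    def handle_data(self, data):
        if self._in_price:
            self.texts[-1] += data


# Regex runs on the extracted text, which has a constant "Name: $0.00" shape.
PRICE_RE = re.compile(r"^(?P<name>\w+): \$(?P<amount>\d+\.\d{2})$")


def extract_prices(html):
    """Return {item name: price} for the price entries found in html."""
    collector = PriceTextCollector()
    collector.feed(html)
    collector.close()
    prices = {}
    for text in collector.texts:
        match = PRICE_RE.match(text.strip())
        if match:
            prices[match.group("name")] = float(match.group("amount"))
    return prices


print(extract_prices(SAMPLE_HTML))
```

The `$2.00` in the note never reaches the regex, because the parser has already narrowed things down to the right elements. A regex run over the raw HTML would have to encode that structure itself.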

1 comment

What percentage of web scraper routines resort to regex when they should at least start with XPath or some equivalent parser-based approach?