Hacker News
by senkora 1180 days ago
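Regular expressions, in the formal sense, can only recognize regular languages. More powerful parsing techniques can handle context-free languages or even recursively enumerable languages. These are fundamentally distinct levels of the Chomsky hierarchy, and each one needs a strictly more powerful machine than the one below it.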

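It is fundamentally impossible to correctly parse e.g. HTML with regular expressions: elements can nest to arbitrary depth, and matching each opening tag to its closing tag takes unbounded memory, which a finite automaton does not have. See: https://stackoverflow.com/a/1732454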
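
A rough sketch of the failure mode, assuming Python's standard `re` and `html.parser` modules. The sample strings and the `DivDepthTracker` class are made up for illustration, and this only covers the nested `<div>` case, not HTML in general:

```python
import re
from html.parser import HTMLParser

nested = "<div>outer <div>inner</div> tail</div>"
siblings = "<div>a</div><div>b</div>"

# A non-greedy pattern stops at the FIRST closing tag, so it cuts the
# outer element short when another <div> is nested inside it.
regex_nongreedy = re.search(r"<div>(.*?)</div>", nested).group(1)

# A greedy pattern runs to the LAST closing tag, so it glues two
# sibling elements together.
regex_greedy = re.search(r"<div>(.*)</div>", siblings).group(1)


class DivDepthTracker(HTMLParser):
    """Hypothetical helper: tracks <div> nesting with an explicit counter."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.max_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.depth += 1
            self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        if tag == "div":
            self.depth -= 1


tracker = DivDepthTracker()
tracker.feed(nested)
tracker.close()

print(repr(regex_nongreedy))  # content cut off at the inner </div>
print(repr(regex_greedy))     # two siblings merged into one match
print(tracker.max_depth, tracker.depth)
```

Neither quantifier gets both inputs right, because picking the correct closing tag means counting how many are still open. The parser version does that with a counter that can grow without bound, which is exactly the state a plain regular expression has no way to keep.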

1 comment

Ah yes, the classic StackOverflow answer. If that doesn't teach you not to parse HTML with regex, nothing will...