|
|
|
|
|
by megaduck
5830 days ago
|
|
The language of individual HTML tags is certainly regular, and trivially easy. However, the language of "matched HTML tags with junk between them" is NOT regular. Anything that requires balanced matching is NOT parseable with standard Regular Expressions, and by not parseable I mean that you will literally have an infinite amount of bugs. Shoot me an email and I can show you the math. Even with Perl's whiz-bang recursive not-really-regexes-regexes, it's strongly not recommended to tackle balanced matching problems like HTML or XML. It might be theoretically possible (I haven't actually checked), but your brain will leak from your ears and you probably won't get it right, no matter how smart you are. |
|