by lukan 207 days ago
It also gives the right solution: use an XML parser.

We don’t know the use case.

Maybe the questioner is also in full control of the HTML creation and they don’t need a parser for all possible HTML edge cases.

Maybe they are, but they would also need to restrict themselves to a well-defined subset of HTML and show that the subset is described by a regular (Chomsky Type 3) grammar.
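
To make the regular-grammar point concrete, here is a rough sketch, assuming a made-up input with nested <div> elements (the sample string and the helper names are mine, not from the original question). A naive regex stops at the first closing tag, while a parser that keeps a depth counter handles the nesting. It uses Python's standard `re` and `html.parser` modules.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample: one <div> nested inside another.
SAMPLE = "<div>outer <div>inner</div> tail</div>"

# Naive attempt: a non-greedy match ends at the FIRST </div>,
# so the captured "content" cuts the outer element short.
naive = re.search(r"<div>(.*?)</div>", SAMPLE).group(1)


class DepthTracker(HTMLParser):
    """Counts how deeply <div> elements are nested."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.max_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.depth += 1
            self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        if tag == "div":
            self.depth -= 1


def max_div_depth(html):
    """Return the deepest <div> nesting level seen in html."""
    tracker = DepthTracker()
    tracker.feed(html)
    tracker.close()
    return tracker.max_depth


print(repr(naive))
print(max_div_depth(SAMPLE))
```

The depth counter is the part a plain regular expression has no equivalent for: matching arbitrarily nested open/close pairs needs unbounded memory. If the generated HTML is guaranteed never to nest the element in question, that objection goes away, which is the "well-defined subset" condition.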

It seems that even the conceptually very simple example given by the questioner is impossible to handle with a regular expression.