by unlinkr
2473 days ago
The problem with the answer is that it is wrong. The question is about identifying start tags in XHTML. This is a question of tokenization, and it can be solved with a regular expression. Indeed, most parsers use regular expressions for the tokenization stage. It is exactly the right tool for the job! Furthermore, the asker specifically needs to distinguish between start tags and self-closing start tags. This is a token-level difference that XHTML parsers typically do not expose. So saying "use a parser" is less than helpful. I have elaborated a bit in a blog post: https://www.cargocultcode.com/solving-the-zalgo-regex/
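To make the tokenization point concrete, here is a minimal sketch in Python. It is not the regex from the blog post. The pattern, the `TOKEN` name, and the `start_tags` helper are my own illustration, and it assumes well-formed XHTML in which attribute values are always quoted. Comments and CDATA sections are matched first so that tag-like text inside them is skipped.

```python
import re

# Hypothetical token pattern (an assumption for illustration, not the
# blog post's regex). Assumes well-formed XHTML with quoted attribute values.
TOKEN = re.compile(
    r"""
      (?P<comment><!--.*?-->)              # comments are consumed and ignored
    | (?P<cdata><!\[CDATA\[.*?\]\]>)       # CDATA sections are consumed and ignored
    | <(?P<name>[A-Za-z_][\w:.-]*)         # start tag: '<' followed by a name
       (?:\s+[\w:.-]+\s*=\s*(?:"[^"]*"|'[^']*'))*   # zero or more quoted attributes
       \s*(?P<selfclose>/?)>               # optional '/' marks a self-closing tag
    """,
    re.VERBOSE | re.DOTALL,
)


def start_tags(source):
    """Yield (name, is_self_closing) for each start tag found in source."""
    for match in TOKEN.finditer(source):
        name = match.group("name")
        if name is None:
            # The match was a comment or a CDATA section, not a tag.
            continue
        yield name, match.group("selfclose") == "/"


if __name__ == "__main__":
    sample = '<p><a href="foo">x</a><br /><hr class="foo" /><!-- <b> --></p>'
    for name, self_closing in start_tags(sample):
        print(name, "self-closing" if self_closing else "open")
```

End tags such as `</a>` never match because the character after `<` must begin a name. The self-closing distinction falls out of a single capture group, which is exactly the token-level detail a tree-building parser tends to discard.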