Hacker News new | ask | show | jobs
by blenderob 20 days ago
I totally understand the general advice of using actual XML DOM library for making DOM. But for my own understanding, I want to ask why the 5 escapes the OP suggests (&, <, >, " and ') aren't good enough? Do you see anyway to exploit it if these 5 are escaped? Someone kind enough to enlighten me?
1 comments

They are:

> The ampersand character (&) and the left angle bracket (<) MUST NOT appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they MUST be escaped using either numeric character references or the strings " &amp; " and " &lt; " respectively. The right angle bracket (>) may be represented using the string " &gt; ", and MUST, for compatibility, be escaped using either " &gt; " or a character reference when it appears in the string " ]]> " in content, when that string is not marking the end of a CDATA section.

> In the content of elements, character data is any string of characters which does not contain the start-delimiter of any markup and does not include the CDATA-section-close delimiter, " ]]> ". In a CDATA section, character data is any string of characters not including the CDATA-section-close delimiter, " ]]> ".

> To allow attribute values to contain both single and double quotes, the apostrophe or single-quote character (') may be represented as " &apos; ", and the double-quote character (") as " &quot; ".

https://www.w3.org/TR/xml/#syntax