by adamdoupe 3285 days ago

Context here means the context of the output page.

Usually this means the HTML context. Different sanitization is needed depending on _where_ in the HTML document the input is used.

For instance, if the input is used in between HTML tags (let's say $foo is user input in this PHP example):

    ... <body><?php echo $foo ?></body>

Here, what an attacker needs to move the HTML parser into JavaScript execution is a < character (among other things), for example: <script>alert(1)</script>.

Therefore, to correctly sanitize this, you would call the PHP `htmlentities` function:

    ... <body><?php echo htmlentities($foo) ?></body>

Now, this XSS vulnerability is fixed.
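As a rough check of the fix above, here is a minimal sketch of what `htmlentities` does to that payload. The value assigned to `$foo` is a hypothetical stand-in for attacker input, and the flags and charset are passed explicitly (my choice, not part of the original example) so the result does not depend on the PHP version's defaults.

```php
<?php
// Hypothetical attacker-controlled value standing in for $foo.
$foo = '<script>alert(1)</script>';

// Explicit flags and charset, so the behavior is the same across PHP versions.
$safe = htmlentities($foo, ENT_QUOTES, 'UTF-8');

// The < and > are now entities, so the browser renders text instead of a tag.
echo "<body>" . $safe . "</body>\n";
// prints: <body>&lt;script&gt;alert(1)&lt;/script&gt;</body>
```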

What if foo is used in a different context?

    ... <body><a href='<?php echo htmlentities($foo) ?>'>...

Here, what we need to transition the HTML parser to executing JavaScript is a ' character, which `htmlentities` leaves unencoded under its default flags before PHP 8.1 (`ENT_COMPAT`). This can be exploited by the following input (in between the double quotes): "' onclick='alert(1)"
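To make the attribute breakout concrete, here is a small sketch that builds the tag with that input. It passes `ENT_COMPAT` explicitly, which was the default flag set before PHP 8.1, so the result is the same on newer versions; the link text is a placeholder.

```php
<?php
// The exploit input from above, standing in for $foo.
$foo = "' onclick='alert(1)";

// ENT_COMPAT encodes double quotes but leaves single quotes alone.
$escaped = htmlentities($foo, ENT_COMPAT, 'UTF-8');

// The single quote closes the href value early and a new attribute appears.
$html = "<a href='" . $escaped . "'>link</a>";
echo $html, "\n";
// prints: <a href='' onclick='alert(1)'>link</a>
```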

The key problem is that `htmlentities`, called this way, is not valid sanitization in the context of a single-quoted HTML attribute value. In this example, you need to use `urlencode`, which percent-encodes the quote:

    ... <body><a href='<?php echo urlencode($foo) ?>'>...
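A quick sketch of the `urlencode` version with the same exploit input, plus one alternative I am adding for comparison: `htmlentities` with `ENT_QUOTES`, which also encodes the single quote. Treat this as an illustration under those assumptions rather than a complete recipe.

```php
<?php
// The exploit input from above, standing in for $foo.
$foo = "' onclick='alert(1)";

// urlencode percent-encodes the quote, so it can no longer end the attribute.
$encoded = urlencode($foo);
echo "<a href='" . $encoded . "'>link</a>\n";
// prints: <a href='%27+onclick%3D%27alert%281%29'>link</a>

// Alternative (not in the original example): ENT_QUOTES makes htmlentities
// encode single quotes as well as double quotes.
$quoted = htmlentities($foo, ENT_QUOTES, 'UTF-8');
echo "<a href='" . $quoted . "'>link</a>\n";
```

Note that `urlencode` also encodes characters such as `:` and `/`, so it fits a value placed inside a URL (a query parameter, say) better than a whole URL.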

The general idea also applies to CSS, JSON, and JavaScript. SQL is a different vulnerability class (SQL injection).
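For the JavaScript case, a hedged sketch of the same mismatch: assume a hypothetical template that writes `$foo` into an inline script as `var x = <?php echo htmlentities($foo) ?>;`. The variable name `x` and the choice of `json_encode` with the `JSON_HEX_*` flags as the context-appropriate encoder are my assumptions, not something the comment prescribes.

```php
<?php
// Hypothetical input: contains nothing that htmlentities would touch.
$foo = 'alert(1)';

// HTML escaping leaves it unchanged, so it would run as code in a script block.
$htmlEscaped = htmlentities($foo, ENT_QUOTES, 'UTF-8');
echo "<script>var x = " . $htmlEscaped . ";</script>\n";
// prints: <script>var x = alert(1);</script>

// Encoding for the JavaScript context instead yields a quoted string literal.
// The JSON_HEX_* flags also escape <, >, &, ' and " inside the literal.
$flags = JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_AMP;
$jsLiteral = json_encode($foo, $flags);
echo "<script>var x = " . $jsLiteral . ";</script>\n";
// prints: <script>var x = "alert(1)";</script>

// A payload that tries to close the script element is neutralized too.
$closing = json_encode('</script><script>alert(1)</script>', $flags);
```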

I highly recommend the following research paper from 2011 that discusses the context-sensitivity of JavaScript in depth: http://www.comp.nus.edu.sg/~prateeks/papers/scriptgard-ccs11...

In my mind, the context-sensitivity of XSS is one of the key reasons why it is so prevalent.