by voxic11
464 days ago
This is the article I link my developers to whenever I see them making this mistake: "Don’t try to sanitize input. Escape output." https://benhoyt.com/writings/dont-sanitize-do-escape/
It's been fairly effective at making them realize the fundamental mistake they are making. Quoting the key part:
> The only code that knows what characters are dangerous is the code that’s outputting in a given context.
> So the better approach is to store whatever name the user enters verbatim, and then have the template system HTML-escape when outputting HTML, or properly escape JSON when outputting JSON and JavaScript.
> And of course use your SQL engine’s parameterized query features so it properly escapes variables when building SQL
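For anyone who wants the quoted advice in concrete form, here is a minimal sketch in Python, using only the standard library. The table name, the sample input, and the variable names are made up for illustration and are not from the article. The name is stored verbatim through a parameterized query, and each output context applies its own encoding at the point of output.

```python
import html
import json
import sqlite3

# Hypothetical user input containing characters that matter in HTML, SQL, and JSON.
name = '<script>alert("hi")</script> O\'Brien'

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

# Parameterized query: the value is passed separately from the SQL text,
# so it is never glued into the statement as a string.
conn.execute("INSERT INTO users (name) VALUES (?)", (name,))

# The stored value is exactly what the user typed; nothing was "sanitized".
stored = conn.execute("SELECT name FROM users").fetchone()[0]

# Each output context encodes for itself, at output time.
html_out = "<h1>" + html.escape(stored) + "</h1>"
json_out = json.dumps({"name": stored})

print(html_out)
print(json_out)
```

The same stored string produces different encodings for HTML and for JSON, which is the article's point: only the output code knows which characters are dangerous in its context.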
|
Or, stop using stringly template systems, and treat the data as what it is: a structured language, with well-defined grammar.
One of these days I need to write an article titled "Don't play with escaping strings. Serialize output." Core idea being, "escaping your output" still looks too much like "sanitizing input"[0], and one tiny mistake is all it takes to give an attacker the ability to inject arbitrary code into the page (or give an unlucky user the ability to brick the page for themselves) - so instead of working in "string space", work in whatever semantics your output has, and treat the string form as a serialization problem. In the case of HTML, that means constructing a tree of tags as a data structure, and then serializing it. Then, bugs in the serializer notwithstanding, the whole class of injection problems disappears - you can't do "<h1>$text</h1>" -> "<h1></h1><script .... </h1>" when your "template" is made of data structures like [:h1, $text], because $text can't possibly alter the structure here. Etc.
In some sense, "Don't escape, serialize instead" is the complement of "Parse, don't validate".
(See also: make invalid states unrepresentable.)
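To make the [:h1, $text] idea concrete, here is a rough sketch in Python. The node shape (tag, attrs, *children), the `render` function, and the `VOID` set are my own illustrative choices, not any particular library's API. It assumes tag and attribute names come from the program rather than from users, and it does not handle elements like script or style, whose contents need different rules.

```python
import html

# Elements that have no closing tag (illustrative subset).
VOID = {"br", "hr", "img", "input", "meta", "link"}

def render(node):
    """Serialize a node to HTML text.

    A str is a text node; anything else is (tag, attrs_dict, *children).
    """
    if isinstance(node, str):
        # Text is always escaped by the serializer, never by the caller.
        return html.escape(node)
    tag, attrs, *children = node
    attr_text = "".join(
        f' {key}="{html.escape(str(value))}"' for key, value in attrs.items()
    )
    if tag in VOID:
        return f"<{tag}{attr_text}>"
    inner = "".join(render(child) for child in children)
    return f"<{tag}{attr_text}>{inner}</{tag}>"

# Hostile text that would break out of a string-glued "<h1>$text</h1>" template.
text = "</h1><script>alert(1)</script>"
page = ("h1", {"class": "title"}, text)
out = render(page)
print(out)
```

Whatever `text` contains, it stays a single text child of the h1 node; the only place a string is ever produced is inside `render`, so there is no call site where someone can forget to escape.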
--
[0] - Who ever sanitizes input? I've only ever seen this kind of sanitization the article describes happen in the output, within string-gluing templates.