|
> In practice the type that is "passed around" is almost always just "string" That's what happens in practice, of course. The GP was proposing something else, and I was explaining how complicated that gets. > and this is converted at the last moment to a single destination format, such as HtmlString. I explained before why this doesn't work unless we're talking about the final destination of this string. Otherwise, if that string is being taken through various encodings (say user input to JSON to sprintf format string to HTTP body), and if you need to combine safe and unsafe input, then what you're saying doesn't work anymore. Here is a sketch of an example: userInput := read()
jsonFormat := "{\"context\": \"%s\", \"input\": \"" + json.escape(userInput) + "\"}" //easy and safe
finalJson := ""
sprintf(finalJson, jsonFormat, "some context")
// oops - unsafe if original input was "%s"
sprintf(finalJson, printf.escape(jsonFormat), "some context")
// oops - does the wrong thing - it will output "{\"context\": \"%s\", \"input\": \"%s"\"}
//let's try the other way around?
userInput := read()
formatStr := "{\"context\": \"%s\", \"input\": \"" + printf.escape(userInput) + "\"}" //easy and safe
finalJsonStr := ""
sprintf(finalJson, formatStr, "some context")
// oops - unsafe if userInput was "safe-looking\", \"bypassAuth\": \"true\"}"
sprintf(finalJson, json.escape(formatStr), "some context")
// oops - does the wrong thing - it will output
// "\"{\\\"context\\\":\\\"some context\\\", \\\"input\\\": \\\"safe-looking\\\", \\\"bypassAuth\\\": \\\"true\\\"}\"" - that is, a JSON string instead of a JSON object
The only solution to get this to work is to keep the user input string entirely separate from any other string, and apply escaping to it individually at every level where it is used.Additionally, you will need to remember what escaping has been applied to it, and in what order, so that it can be un-escaped back to the original value when needed. |
This is how you put the safety on and return the gun into its holster:
Test here: https://dotnetfiddle.net/p8P1fOFundamentally, putting any format like JSON or any user-controlled input into the first parameter of sprintf or any similar function in any language is Wrong with a capital W. It ought to be picked up in code review.
Ideally, sprintf-like functions in strongly typed languages should use a special "FormatString" type instead of a plain string as the first input. This would automatically fix any such issues, but relying on this is still problematic. Naively printing potentially malicious input to places like the console is still quite dangerous, no matter how much you escape it! Logs can be captured into systems that then paste it directly into HTML. Similarly, console control codes can be used by attackers as a nuisance. Etc... Structured logging, along the lines of OpenTelemetry is safer.
See: https://owasp.org/www-community/attacks/Log_Injection
This is the safe equivalent of your second example. Both format strings and JSON are correctly handled:
This outputs: Link: https://dotnetfiddle.net/Lm8jkR