| You're basically running around with your finger on the trigger and suggesting that everyone everywhere ought to wear ballistic armour to compensate. This is how you put the safety on and return the gun into its holster: using System;
using System.Text.Json;
string maliciousInput = "{0} % $0 -- DROP TABLE \"USERS\"";
// Always, always, ALWAYS use a proper serializer for assembling formats like JSON.
// The malicious input can include actual JavaScript, and it'll be correctly encoded with 100% safety.
string encoded = JsonSerializer.Serialize( new {
context= "{0}", // .NET format string placeholder
input=maliciousInput
});
// This will just work, formatting placeholders are ignored if no parameters are specified
Console.WriteLine(encoded);
// A safe FormatException is thrown if you mis-use the string formmating code.
// No vulnerability other than DDoS.
Console.WriteLine(encoded, "adfasfd");
Test here: https://dotnetfiddle.net/p8P1fOFundamentally, putting any format like JSON or any user-controlled input into the first parameter of sprintf or any similar function in any language is Wrong with a capital W. It ought to be picked up in code review. Ideally, sprintf-like functions in strongly typed languages should use a special "FormatString" type instead of a plain string as the first input. This would automatically fix any such issues, but relying on this is still problematic. Naively printing potentially malicious input to places like the console is still quite dangerous, no matter how much you escape it! Logs can be captured into systems that then paste it directly into HTML. Similarly, console control codes can be used by attackers as a nuisance. Etc... Structured logging, along the lines of OpenTelemetry is safer. See: https://owasp.org/www-community/attacks/Log_Injection This is the safe equivalent of your second example. Both format strings and JSON are correctly handled: Console.WriteLine( "{0}", JsonSerializer.Serialize( new {
context="{0}", // if sprintf/WriteLine is not misused delibaretely, this is safe!
input="safe-looking\", \"bypassAuth\": \"true\"}"
}));
This outputs: {"context":"{0}","input":"safe-looking\u0022, \u0022bypassAuth\u0022: \u0022true\u0022}"}
Link: https://dotnetfiddle.net/Lm8jkR |
Also, DoS is a legitimate concern, and of course using a safe language makes other consequences less dire. I wasn't using sprintf() in my example for nothing.
> Naively printing potentially malicious input to places like the console is still quite dangerous, no matter how much you escape it! Logs can be captured into systems that then paste it directly into HTML.
This is exactly my point: when you include untrusted input into another string, even if you escape the untrusted input correctly for the desired format of that string, the entire output is now untrusted, and generally can't be further processed safely. Yours is a perfect example: you can escape the user input to make sure it is formatted safely, but you can't at this point tell in what other ways it should be escaped for other systems that may process it (for example, even printing to the actual console like this may be unsafe, as the user input may include terminal control characters).
Even this problem is still simple if we can assume that, say, anything printed to the console that later needs to be displayed in HTML should be considered an HTML string - it just becomes a simple responsibility of the log collector to properly escape the log lines as HTML content.
The problem is much harder if the intention is to actually control the HTML output through log lines (say, adding new-lines through br in your log statements, or emphasis or whatever). If that is a necessary component of your system, you need to re-architect this so that log lines themselves are no longer simple strings, but are structured so that any user-controlled input is kept separate from the trusted application-control formatting
Say, instead of logging
you would log and the log collector would need to know to recombine it into the original string, escaping the untrusted part as needed, before outputting it to HTML as Edit: interestingly, I had to use <script\> instead of a normal opening script tag, as HN would give me a TLS error if the comment contains the normal opening script tag...Wonder if there is some input sanitization going on here as well.