Linux is probably the most carefully constructed C codebase in existence and still falls into C pitfalls semi-regularly. Every other project has no hope of safely using C. It's looking more and more like Linux should be carefully rewritten in Rust. It's a monstrous task but I can see it happening over the next decade.
That didn't answer anything. If you want to do anything with your input, you have to run it through a parser. Doesn't matter if it's untrusted or not. Your only options are ignoring the input, echoing it somewhere, or parsing it.
Right, but you do have the option of writing that parser in a language other than C. And given how often severe security issues are caused by such parsers written in C, one probably ought to choose a different language, or at least use C functions and string types that store a length rather than relying on null termination.
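To make the "string types that store a length" point concrete, here is a minimal sketch in C. The `byte_view` type and the `view_slice` and `view_equals` helpers are hypothetical names made up for illustration, not part of any standard library; the only assumption is that the caller already knows how many bytes it received.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type: a read-only view that carries its own length,
 * so nothing downstream has to search for a terminating '\0'. */
typedef struct {
    const unsigned char *data;
    size_t len;
} byte_view;

/* Writes the sub-view [off, off + n) to *out and returns 1 if that range
 * lies inside `in`; returns 0 and leaves *out untouched otherwise. */
static int view_slice(byte_view in, size_t off, size_t n, byte_view *out)
{
    /* Written as two comparisons so that off + n can never wrap around. */
    if (off > in.len || n > in.len - off)
        return 0;
    out->data = in.data + off;
    out->len = n;
    return 1;
}

/* Compares a view against a C string literal. Only the literal is assumed
 * to be null-terminated; the view is compared by length. */
static int view_equals(byte_view v, const char *s)
{
    size_t n = strlen(s);
    return v.len == n && memcmp(v.data, s, n) == 0;
}
```

With something like this, every access to the input goes through a range check in one place, which is roughly what slices in safer languages give you for free.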
If you have the input in a buffer of known length in C, hand it off to a (dynamic or static) library written in a safe language, and get back trusted parsed output, then there's much less attack surface in your C code.
The issue in many of these cases is that there appears to be no canonical safe way to know the length of the input in C, and people apparently screw up keeping track of the lengths of the buffers all the time.
1. Well don’t write in C then if your program is security critical or going to be exposed over a network. Sure, there are some targets that require C, but that’s not the case for the vast majority of platforms running OpenSSL.
2. That’s still less of a problem as the C will then be handling trusted data validated by the safe language.
If you make argument 2), could you explain how writing a parser is more security critical than any other code that has a (direct or indirect) interaction with the network? At least recursive descent parsers are close to trivial. I usually start by writing a "next_byte" function and then "next_token". You'll have to look very hard to find any pointer code there. It's close to impossible to get this wrong and I don't see how the fact that it's a parser would make it any more dangerous.
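For readers who haven't seen the "next_byte" then "next_token" style, a rough sketch in C might look like the following. Everything beyond those two function names is an assumption made for illustration: the `reader` struct, the `peek_byte` helper, the `token` type, and the toy grammar (decimal numbers, `+`, and spaces) are all hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical cursor over an input buffer of known length. */
typedef struct {
    const unsigned char *buf;
    size_t len;
    size_t pos;
} reader;

/* Returns the byte at the cursor without consuming it, or -1 at end of input. */
static int peek_byte(const reader *r)
{
    return r->pos < r->len ? r->buf[r->pos] : -1;
}

/* Consumes and returns the next byte (0..255), or -1 at end of input. */
static int next_byte(reader *r)
{
    if (r->pos >= r->len)
        return -1;
    return r->buf[r->pos++];
}

typedef enum { TOK_EOF, TOK_NUM, TOK_PLUS, TOK_ERR } token_kind;

typedef struct {
    token_kind kind;
    unsigned long value;   /* meaningful only when kind == TOK_NUM */
} token;

/* Toy tokenizer: skips spaces, then returns a number, '+', EOF, or an error. */
static token next_token(reader *r)
{
    token t = { TOK_EOF, 0 };
    int c;

    while (peek_byte(r) == ' ')
        next_byte(r);

    c = next_byte(r);
    if (c < 0)
        return t;                      /* end of input */
    if (c == '+') {
        t.kind = TOK_PLUS;
        return t;
    }
    if (c >= '0' && c <= '9') {
        t.kind = TOK_NUM;
        t.value = (unsigned long)(c - '0');
        while ((c = peek_byte(r)) >= '0' && c <= '9') {
            unsigned long d = (unsigned long)(c - '0');
            /* Reject numbers that would not fit instead of wrapping. */
            if (t.value > (ULONG_MAX - d) / 10) {
                t.kind = TOK_ERR;
                return t;
            }
            t.value = t.value * 10 + d;
            next_byte(r);
        }
        return t;
    }
    t.kind = TOK_ERR;                  /* unexpected byte */
    return t;
}
```

The buffer is only ever indexed inside `peek_byte` and `next_byte`, both behind a length check, so the tokenizer itself contains no pointer arithmetic. The overflow check on the number is the kind of detail that still has to be handled by hand in this sketch.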
I agree that the original statement encourages that interpretation, but I think it admits the interpretation that the parser itself is in C and I think that is what was intended.
Even if application constraints mean you can't write a parser in another language that's linkable to C, why couldn't you use a parser generator that outputs C?
>parsers for untrusted input in C