by pjc50 3856 days ago
Possible, but not necessarily easy; what tools do you use to sweep C++ code for 'unsafe' constructs?
2 comments

I'm interested in the answer to that, too. A member of the Chrome team asked what static analysis or advanced verification tools I thought they could use in a significant C++ project. Digging around, I think I found just one limited tool, plus two ways of doing Design-by-Contract (asserts & OOP). That was it. Not inspiring lol.

Now, there has been work on type-safe or memory-safe versions of C++. They're non-standard. They also get smashed when a memory error occurs, and that will happen. So, suggesting to rely on language-based isolation in C++ is more a joke than something worth trying.

Good example of work on C++ safety:

https://www.cis.upenn.edu/~eir/papers/2013/ironclad/paper.pd...

> they could use in a significant C++ project.

The difference with Chromium is you've got all that integration with standard C runtimes that are inherently dicey. Unikernels are a different story.

C runtimes do add extra issues but that's not relevant to my comment. I said I largely came up dry on methods to prove correctness of C++ code. That's quite important if one is considering C++ vs. another language for a robust application and/or unikernel. C++ would be a bad choice if the language itself was supposed to contribute to robustness.

> C runtimes do add extra issues but that's not relevant to my comment.

I really don't see how it is. If you're running on a desktop platform, you've got a huge exposed surface that is working with raw pointers to proprietary logic. That makes provable correctness a far, far more complex problem.

> I said I largely came up dry on methods to prove correctness of C++ code.

It is easy to implement a smart pointer that the compiler can prove will always do bounds checking before dereferencing. The hard part is proving that all the code that uses raw pointers is doing the same thing.
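As a rough illustration of that claim, here is a minimal sketch of a bounds-checked pointer. `BoundedPtr` and `sum_first` are hypothetical names made up for this example, not part of any library. The check itself happens at run time; what can be established statically is only that the raw pointer is private, so every access has to go through the checked `operator[]`.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical bounds-checked pointer: it carries the length of the array
// it points into and checks every index before dereferencing.
template <class T>
class BoundedPtr {
public:
    BoundedPtr(T* base, std::size_t count) : base_(base), count_(count) {}

    // Deduce the length from a built-in array so the caller cannot get it wrong.
    template <std::size_t N>
    explicit BoundedPtr(T (&arr)[N]) : base_(arr), count_(N) {}

    // The only way to reach an element; throws instead of reading out of range.
    T& operator[](std::size_t i) const {
        if (i >= count_) throw std::out_of_range("BoundedPtr: index out of range");
        return base_[i];
    }

    std::size_t size() const { return count_; }

private:
    T* base_;            // never exposed, so callers cannot bypass the check
    std::size_t count_;
};

// Example caller: sums the first n elements, throwing if n exceeds p.size().
inline int sum_first(const BoundedPtr<int>& p, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += p[i];
    return total;
}
```

Code that only ever sees a `BoundedPtr` cannot index out of range without an exception; code that still holds the underlying `T*` gets no such guarantee, which is the hard part mentioned above.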

"I really don't see how it is. If you're running on a desktop platform, you've got a huge exposed surface that is working with raw pointers to proprietary logic. That makes provable correctness a far, far more complex problem."

The point is that C++ itself is damn-near impossible to analyze on the cheap and without many false positives. That's before I even considered the C interface. Then there's C-level problems that have been the reason I've opposed it forever. At least there's tons of stuff to draw on in analysing, transforming, etc. that code. I'd go with a C subset or Java/Ada subset with high-integrity runtime any day over C++.

"It is easy to implement a smart pointer that the compiler can prove will always do bounds checking before dereferencing. The hard part is proving that all the code that uses raw pointers is doing the same thing."

I'll take your word on the smart pointers doing bounds-checks as I'm not up-to-date on all the techniques of C++ developers. Academics need to do a fresh take on that with assessments vs particular risks & compared to current languages. Meanwhile, most in safety-critical development that I know of don't use C++ because it's too complex and unsafe per them. I know there's a MISRA subset and some other stuff. There are people who use it with thorough testing and source-to-object validation. Mostly not in use, though.

So, do you have any resources showing that C++ code is safe and analysable if one just uses smart pointers? And what tools and subset do you use to do that? If you have that and proof it works, then that could help a lot of developers using C that aren't aware of it. I'm being serious as much as I am challenging your claim. If you have it, I'll consider it.

> I'll take your word on the smart pointers doing bounds-checks as I'm not up-to-date on all the techniques of C++ developers. Academics need to do a fresh take on that with assessments vs particular risks & compared to current languages.

That's kind of already happened. Stroustrup has done a whole ton of work in that area with Concepts.

> Meanwhile, most in safety-critical development that I know of don't use C++ because it's too complex and unsafe per them.

It turns out that provable correctness invariably involves a fair bit of complexity (you are basically compiling a mathematical proof). People use Haskell and Coq to really do it right, and --surprise-- those turn out to be hard for programmers to learn.

A lot of other, more popular, high level languages are actually terrible for provable correctness, even if they are better for proving memory safety. C++ isn't as effective for the job, but it has the advantage of being great for integrating with the platform. It is a trade-off, but one that is well worthwhile.

> So, do you have any resources showing that C++ code is safe and analysable if one just uses smart pointers?

This stuff goes back a way, but stemmed from Modern C++ Design. There is a whole world of policy based design where you use the type system (much as with Haskell and Coq) to enforce declarative policies.

The quick thought experiment would be something like this:

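    template <class T>
    struct SafeRef {
        explicit SafeRef(T* p) : x(p) {}
        // Validity test goes here (null, bounds, lifetime, ...).
        // It is const so the const conversion below can call it.
        void check() const { /* ... */ }
        // Every conversion to a reference runs check() first.
        operator T&() { check(); return *x; }
        operator const T&() const { check(); return *x; }
    private:
        T* x;
    };

You can overload operator-> to make it behave more like a proper pointer. CRTP gives you some pretty powerful ways of getting the job done too.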
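To make the operator-> and CRTP remark concrete, here is one way it might look. This is only a sketch: `CheckedAccess`, `NonNullRef` and `example_usage` are hypothetical names, and the null test stands in for whatever check a real policy would enforce. The CRTP base supplies the pointer-like operators once and routes each access through the derived class's `check()`.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// CRTP base: provides operator-> and operator*, and calls Derived::check()
// before handing out the pointer. Derived must expose check() and get().
template <class Derived, class T>
class CheckedAccess {
public:
    T* operator->() const { self().check(); return self().get(); }
    T& operator*() const { self().check(); return *self().get(); }

private:
    const Derived& self() const { return static_cast<const Derived&>(*this); }
};

// One possible check: refuse to dereference a null pointer.
template <class T>
class NonNullRef : public CheckedAccess<NonNullRef<T>, T> {
public:
    explicit NonNullRef(T* p) : x_(p) {}

    void check() const {
        if (x_ == nullptr) throw std::logic_error("dereferenced a null NonNullRef");
    }
    T* get() const { return x_; }

private:
    T* x_;
};

// Small usage example: member access goes through the checked operator->.
inline void example_usage() {
    std::string s = "abc";
    NonNullRef<std::string> r(&s);
    assert(r->size() == 3);
}
```

A different derived class could plug in a different `check()` (bounds, ownership, and so on) without rewriting the operators, which is the appeal of doing it this way.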
...and just to provide an example of how one does verification for C: https://galois.com/blog/2013/09/high-assurance-base64/

Around memory safety, the thing to flag would simply be any dereference of a raw pointer. You'd whitelist a set of smart pointer classes that manage that, and you can pretty much just use the clang parser to catch the rest.