by Chabsff 812 days ago
> languages that do not currently have this ability

That seems to be a very common position, and one that's super weird to me. C and particularly C++ absolutely have that ability with library support if you know what you are doing.

The only material difference, from my point of view, is that the default behavior of the language is different.

I will fully grant that the path of least resistance being dangerous is a huge issue in C/C++, and one that Rust addresses, but extending that all the way to saying that the language lacks the ability is really excessive.

3 comments

There's a big cultural problem and there are several big technical problems.

Rust has a safety culture, and C++ does not. In Rust's safety culture it was obvious that std::mem::uninitialized (an unsafe function) should be deprecated because it's more dangerous than it appears: it's actually hard to use correctly. That's why today we have the MaybeUninit type. In C++ it was apparently equally obvious that std::span, a brand new type in C++20, should not have a safe index operation.
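To make the MaybeUninit point concrete, here is a minimal sketch of filling a buffer without zeroing it first. The function name `squares` is made up for illustration; the only library calls are `MaybeUninit::uninit`, `as_mut_ptr`, and `assume_init`, plus raw pointer `add` and `write`.

```rust
use std::mem::MaybeUninit;

/// Builds `[0, 1, 4, 9]` without initializing the array up front.
/// `squares` is a hypothetical name used only for this sketch.
pub fn squares() -> [u32; 4] {
    // The type says "maybe uninitialized", so no `u32` value is claimed to
    // exist yet. That claim is what made `mem::uninitialized` so easy to misuse.
    let mut buf: MaybeUninit<[u32; 4]> = MaybeUninit::uninit();
    let base = buf.as_mut_ptr() as *mut u32;
    for i in 0..4 {
        // SAFETY: `base.add(i)` stays inside the 4-element array, and
        // `write` neither reads nor drops the old (uninitialized) contents.
        unsafe { base.add(i).write((i as u32) * (i as u32)) };
    }
    // SAFETY: the loop above wrote all four elements.
    unsafe { buf.assume_init() }
}
```

The obligation ("every element is written before `assume_init`") is stated at the one place it has to hold, rather than being implied by a value that merely looks initialized.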

Technically, the safe/unsafe distinction being at the language level makes it hard to fake. You can say your C++ only uses your safe abstractions, but the language itself doesn't care, so without inspecting every part of it to check, you're never more than one slip away from catastrophe.

Most importantly in this context, at the language level Rust is committed to this safety distinction. If you write code where Rust's compiler can't see why it's OK, the compiler rejects your program. C++ requires that a conforming compiler must instead accept programs unless it can show why they're wrong. These are two possible ways to cut the Gordian knot of Rice's Theorem, but they have very different consequences.
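A small sketch of the "reject unless the compiler can see why it's OK" side. The helper name `push_then_read_first` is hypothetical. The commented-out lines are the version rustc rejects with error E0502; the live code is one way to restructure it so the borrow ends before the mutation.

```rust
/// Appends a zero and returns the element that was first before the push.
/// `push_then_read_first` is a hypothetical helper for illustration.
pub fn push_then_read_first(v: &mut Vec<i32>) -> Option<i32> {
    // Rejected by rustc (E0502) if written this way, because `first` would
    // still borrow from `v` while `push` is allowed to reallocate it:
    //
    //     let first = &v[0];
    //     v.push(0);
    //     return Some(*first);

    // Accepted: copy the value out, so no borrow is alive across the push.
    let first = *v.first()?;
    v.push(0);
    Some(first)
}
```

The rejected version is the classic dangling-reference-after-reallocation pattern; the rule that catches it also rejects some programs that would have been fine, which is the trade-off being described.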

You can't encapsulate safety in C or C++. There's no `unsafe` keyword like Rust (or like Modula-3). If you have to say "if you know what you're doing," then you haven't encapsulated anything. There really is a categorical difference here. It's not excessive at all. It's the entire point.

Now if I were to say something like, "Rust's safety means that you can never have UB anywhere ever and CVEs will never happen for anything if you use Rust." Then yes, that's excessive. But to say that Rust can encapsulate `unsafe` and C and C++ cannot? I don't see how that's excessive. It's describing one of the most obvious differences between the programming languages.
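For what "encapsulate `unsafe`" can look like in practice, here is a minimal sketch. The module `samples` and the type `Samples` are invented for illustration. The invariant ("never empty") is protected by field privacy, so the unchecked access inside `last` cannot be broken by safe code outside the module.

```rust
pub mod samples {
    /// A list that always holds at least one value.
    /// `Samples` is a hypothetical type used only for this sketch.
    pub struct Samples {
        // Private: code outside this module cannot reach the vector,
        // so it cannot remove elements and break the invariant.
        data: Vec<f64>,
    }

    impl Samples {
        /// Establishes the invariant: the list starts with one element.
        pub fn new(first: f64) -> Self {
            Samples { data: vec![first] }
        }

        /// Preserves the invariant: elements are only ever added.
        pub fn push(&mut self, x: f64) {
            self.data.push(x);
        }

        /// Safe to call with no preconditions on the caller.
        pub fn last(&self) -> f64 {
            // SAFETY: `data` is never empty, so `len() - 1` is in bounds.
            unsafe { *self.data.get_unchecked(self.data.len() - 1) }
        }
    }
}
```

Auditing this means reading one module, not every caller. That is the API guarantee in question: a user who never writes `unsafe` cannot make `last` misbehave.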

You can restrict yourself to particular subsets of C (I'm thinking about MISRA) or C++, but these usually come with even more significant trade-offs than Rust. And I'm not aware of any such subset that provides the ability to encapsulate safety in a way that lets folks not using that subset benefit from it in a way that is impossible to misuse (as a matter of an API guarantee).

Only to add some historical context: ESPOL/NEWP for the Burroughs B5000, in 1961, were among the first systems programming languages with unsafe; many others followed.

The Burroughs B5000 had an additional feature for executables using unsafe code that we only have nowadays on managed runtimes like Java and CLR: binaries with unsafe code were tainted and required someone with admin access to enable them for execution.

Regarding C and C++, Visual Studio, CLion and clang-tidy are the best we have in terms of tooling for the general public supporting the Core Guidelines (including lifetime checks), and they are still relatively basic in what they can actually validate.

AIUI, Modula-3 provided an ability to actually encapsulate unsafety, in that the concept was elevated to the level of interfaces. Did any language prior to Modula-3 have that capability?

I think that's fundamentally different from, although related to, just having an `unsafe` keyword. To take something I know well, Go has an `unsafe` package that acts as a sort of unsafe keyword. If we ignore data races, you can say that a Go program can't violate memory safety if there is no use of the `unsafe` package.

The problem though is that you can't really build new `unsafe` abstractions. You can't write a function that might cause UB on some inputs in a way that requires the caller to write `unsafe`. (You can do this by convention of course, e.g., by putting `Unsafe` in the name of the function.)

In Rust, `unsafe` doesn't just give you the ability to, e.g., dereference raw pointers. It also is required in order to call other `unsafe` functions. You get the benefit of composition so that you can build arbitrary abstractions around `unsafe` with the compiler's support.
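A minimal sketch of that composition, with made-up function names (`read_at`, `read_pair`, `checked_pair`). Calling either `unsafe fn` outside an `unsafe` block is a compile error (E0133), so the obligation is either passed up explicitly or discharged by a check.

```rust
/// Reads element `i` without a bounds check.
///
/// # Safety
/// The caller must guarantee `i < xs.len()`.
pub unsafe fn read_at(xs: &[u8], i: usize) -> u8 {
    // SAFETY: guaranteed by this function's own contract.
    unsafe { *xs.get_unchecked(i) }
}

/// Still `unsafe`: it builds on `read_at` and passes the obligation up.
///
/// # Safety
/// The caller must guarantee `i + 1 < xs.len()`.
pub unsafe fn read_pair(xs: &[u8], i: usize) -> (u8, u8) {
    // SAFETY: `i + 1 < len` implies `i < len` as well.
    unsafe { (read_at(xs, i), read_at(xs, i + 1)) }
}

/// Safe wrapper: discharges the obligation with a runtime check,
/// so callers never need to write `unsafe`.
pub fn checked_pair(xs: &[u8], i: usize) -> Option<(u8, u8)> {
    // `checked_add` avoids overflow when `i` is `usize::MAX`.
    if i.checked_add(1)? < xs.len() {
        // SAFETY: we just checked `i + 1 < xs.len()`.
        Some(unsafe { read_pair(xs, i) })
    } else {
        None
    }
}
```

The point is that `read_pair` is a new unsafe abstraction with its own contract, something a naming convention like an `Unsafe` prefix can only approximate.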

My understanding is that Modula-3 supported this style of encapsulation (which is what I was talking about in this thread). What languages prior to Modula-3 supported it, or was Modula-3 the first?

Starting with that 1961 example, ESPOL/NEWP.

Since Unisys still sells Burroughs, nowadays ClearPath MCP, you can get the latest NEWP manual here, section 8.

https://public.support.unisys.com/framework/publicterms.aspx...

Followed by Mesa/Cedar (CHECKED, TRUSTED, UNCHECKED), Modula-2 (IMPORT SYSTEM), the languages of the Oberon lineage (which follow up on the IMPORT SYSTEM approach), Ada (using Unchecked), ...

In the languages that use the IMPORT SYSTEM approach, the compiler can mark the module as unsafe, and anything that might depend on it.

Some of the Modula-3 folks worked previously on Cedar at Xerox, by the way.

Mesa - http://www.bitsavers.org/pdf/xerox/mesa/5.0_1979/documentati...

Cedar - http://www.bitsavers.org/pdf/xerox/parc/cedar/Cedar_7.0/09_C...

Very interesting. Thank you.

> C and particularly C++ absolutely have that ability with library support if you know what you are doing.

I won't speak to C++, as it's a very different language now than when I last used it. I've been writing C for more than 20 years, and I still make mistakes. And there's nothing keeping me from accidentally doing something unsafe outside my unsafe abstraction, aside from my own perfection at never making mistakes (yeah, right).

Rust requires you to be explicit about the unsafe things you do. And, realistically, even when I'm building a safe interface on top of necessarily-unsafe code, the unsafe portions aren't even that large compared to the entirety of the abstraction. That makes things much easier to audit, and the compiler tells me which sections of code I need to pay more attention to.

To me, this is lacking the ability. "If you know what you are doing" is a laughable constraint. Even people who theoretically do (and I suspect programming ability is a lot like people's self-reported skill at driving a car) still make mistakes sometimes.