by tialaramex 316 days ago
C++ already illustrates this idea you're talking about and we know exactly where this goes. Rust's false positives are annoying, so programmers are encouraged to further improve the borrowck and language features to reduce them. But the C++ or Zig false negatives just mean your program malfunctions in unspecified ways and you may not even notice, so programmers are encouraged to introduce more and more such cases to the compiler.

The drift over time is predictable: compared to ten years ago, Rust has fewer false positives and C++ has more false negatives.

You are correct to observe that there is no middle choice here, that's Rice's Theorem, non-trivial semantic correctness is Undecidable. But I would argue we already know what you're calling the "false positive" scenario is also not useful, we're just not at the point where people stop doing it anyway.


> C++ already illustrates this idea you're talking about and we know exactly where this goes.

No, it doesn't. Zig is safer than C++ (and it's much simpler, which also has an effect on correctness).

You're making up some binary distinction and then deciding that, because C++ falls on the same side of it as Zig (except it doesn't, because Zig eliminates out-of-bounds access to the same degree as Rust, not C++), what applies to one must apply to the other. There is simply no justification to make that equivalence.

> There is no middle choice here, that's Rice's Theorem, non-trivial semantic correctness is Undecidable.

That has nothing to do with Rice's theorem. Proving some properties with the type system isn't a general algorithm; it's a proof you have to work for in every program you write individually. There are languages (Idris, ATS) that allow you to prove any correctness property using the type system, with no false positives. It's a matter of the effort required, and there's nothing binary about that.

To get a sense of the theoretical effort (the practical effort is something to be measured empirically, over time) consider the set of all C programs and the effort it would take to rewrite an arbitrary selection of them in Rust (while maintaining similar performance and footprint characteristics). I believe the effort is larger than doing the same to translate a JS program to a Haskell program.

> There is simply no justification to make that equivalence.

I explained in some detail exactly why this equivalence exists. I actually have a small hope that this time there are enough people who think it's a bad idea that we don't have to watch this play out for decades before the realisation as we did with C and C++.

Yes, it's exactly Rice's Theorem; it's that simple and that drastic. You can choose what to do when you're not sure, but you can't choose (no matter how much effort you imagine applying) to always be sure†; that undecidability is what Henry Rice proved. The languages you mention choose to treat "not sure" the same as "nope", like Rust does; you apparently prefer languages like Zig or C++, which instead treat "not sure" as "it's fine". I have already explained why that's a terrible idea.
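To make the "not sure" choice concrete, here is a minimal Rust sketch of a checker that is allowed to give up, plus the two policies for the undecided case. Everything in it (`Verdict`, `Policy`, `check_index`, `compiles`) is a hypothetical toy invented for illustration; it is not how rustc, Zig or any C++ compiler is actually structured.

```rust
/// Outcome of a static check that is allowed to give up.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Safe,
    Unsafe,
    NotSure,
}

/// What the (hypothetical) compiler does with the undecided case.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Policy {
    /// Treat "not sure" as "nope": false positives are possible.
    RejectWhenUnsure,
    /// Treat "not sure" as "it's fine": false negatives are possible.
    AcceptWhenUnsure,
}

/// Toy check: indexing an array of known length with an index that is
/// either a known constant (`Some`) or unknown until run time (`None`).
pub fn check_index(len: usize, index: Option<usize>) -> Verdict {
    match index {
        Some(i) if i < len => Verdict::Safe,
        Some(_) => Verdict::Unsafe,
        None => Verdict::NotSure,
    }
}

/// Whether the program is accepted under the given policy.
pub fn compiles(verdict: Verdict, policy: Policy) -> bool {
    match (verdict, policy) {
        (Verdict::Safe, _) => true,
        (Verdict::Unsafe, _) => false,
        (Verdict::NotSure, Policy::RejectWhenUnsure) => false,
        (Verdict::NotSure, Policy::AcceptWhenUnsure) => true,
    }
}
```

The two policies only differ on `Verdict::NotSure`, which is the whole disagreement in this thread.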

The underlying fault, which is why I'm confident this reproduces, is in humans. To err is human. We are going to make mistakes and under the Rust model we will curse, perhaps blame the compiler, or the machine, and fix our mistake. In C++ or Zig our mistake compiles just fine and now the software is worse.

† For general purpose languages. One clever trick here is that you can just not be a general purpose language. Trivial semantic properties are easily decided, so if your language can make the desired properties trivial then there's no checking and Rice's Theorem doesn't apply. The easy example is, if my language has no looping type features, no recursive calls, nothing like that, all its programs trivially halt - a property we obviously can't decidably check in a general purpose language.
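A rough Rust sketch of the footnote's point, assuming a made-up expression language (`Expr`, `eval` and `size` are hypothetical names): with no loops and no recursive calls in the language itself, every program halts, and no halting check is needed.

```rust
/// A tiny expression language with no loops, no recursion and no calls.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Lit(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    /// If the first expression is non-zero take the second, else the third.
    If(Box<Expr>, Box<Expr>, Box<Expr>),
}

/// The evaluator only recurses on strictly smaller sub-expressions, so it
/// terminates for every finite `Expr`. Arithmetic wraps on overflow.
pub fn eval(e: &Expr) -> i64 {
    match e {
        Expr::Lit(n) => *n,
        Expr::Add(a, b) => eval(a).wrapping_add(eval(b)),
        Expr::Mul(a, b) => eval(a).wrapping_mul(eval(b)),
        Expr::If(c, t, f) => if eval(c) != 0 { eval(t) } else { eval(f) },
    }
}

/// Number of nodes in the expression: an upper bound on evaluation steps.
pub fn size(e: &Expr) -> usize {
    match e {
        Expr::Lit(_) => 1,
        Expr::Add(a, b) | Expr::Mul(a, b) => 1 + size(a) + size(b),
        Expr::If(c, t, f) => 1 + size(c) + size(t) + size(f),
    }
}
```

Halting here is a property of the language's shape rather than something a checker has to decide per program.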

> I explained in some detail exactly why this equivalence exists.

No, you assumed that Zig and C++ are equivalent and concluded that they'll follow a similar trajectory. It's your premise that's unjustified.

A problem you'd have to contend with is that Rust is much more similar to C++ than Zig in multiple respects, which may matter more or less than the level of safety when predicting the language trajectory.

> But you can't choose (no matter how much effort you imagine applying) to always be sure

That is not Rice's theorem. You can certainly choose to prove every program correct. What you cannot do is have a general mechanism that would prove all programs in a certain language correct.

> One clever trick here is that you can just not be a general purpose language.

That's not so much a clever trick as the core of all simple (i.e. non-dependent) type systems. Type-safety in those languages then trivially implies some property, which is an inductive invariant (or composable invariant) that's stronger than some desired property. E.g. in Rust, "borrow/lifetime-safety" is stronger than UAF-safety.
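As a small illustration of "stronger than the desired property", here is a Rust sketch (the function name `swap_first_two` is made up). Taking two mutable references to different elements of one slice cannot cause a use-after-free, yet the borrow checker rejects the direct form; the standard `split_at_mut` expresses the same disjointness in a way it accepts.

```rust
/// Swap the first two elements of a slice through two mutable references.
/// Panics if the slice has fewer than two elements.
pub fn swap_first_two(v: &mut [i32]) {
    // The direct version is rejected (error E0499: cannot borrow `v[_]`
    // as mutable more than once), even though the elements never overlap:
    //
    //     std::mem::swap(&mut v[0], &mut v[1]);
    //
    // `split_at_mut` hands back two non-overlapping sub-slices instead.
    let (head, tail) = v.split_at_mut(1);
    std::mem::swap(&mut head[0], &mut tail[0]);
}
```

In real code `v.swap(0, 1)` does the same job; the point is only that the invariant the checker enforces is broader than the bug it exists to prevent.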

However, because an effort to prove any property must exist, we can find it for some language that trivially offers it by looking at the cost of translating a correct program in some other language that doesn't guarantee the property to one that does. The reason it's more of a theoretical point than a practical one is that it could be reasonably argued that writing a memory-safe program in C is harder than doing it in Rust in the first place, but either way, there's some effort there that isn't there when writing the program in, say, Java.

> No, you assumed that Zig and C++ are equivalent and concluded that they'll follow a similar trajectory. It's your premise that's unjustified.

They did not say Zig and C++ are equivalent.

And yet, in reality, Rust is also on the "if I am not sure I simply attest that it is fine" side of the fence.

I've been hearing about how I'll inevitably write all this unsafe Rust for... four years now.

Some time back I checked and I had written exactly one unsafe block, and so I inspected it again and I realised two things:

1. It was no longer necessary, Rust could now just do this safely. I rewrote it in safe Rust.

2. It was technically Undefined Behaviour. Predictably, given the chance to shoot myself in the foot, that's exactly what I had done. Like a lot of C and C++ it likely wouldn't in fact blow my foot off in any real scenario, but who knows? Not me, that's for sure.

You are already narrowing this down to only memory safety, which is part one of the Rust fallacies.

Ah yes, "But what about other safety?". An entire year of hand wringing from C++ people was predicated on this. In one of his rambling proposal papers Bjarne listed all manner of exciting different kinds of safety he'd imagined and which, he assured us, C++ was already almost able to achieve thanks to his wisdom and foresight.

And every single item on his list of course requires the thing C++ doesn't have, memory safety. You can't write software which has any non-trivial properties when it has unconstrained Undefined Behaviour. It really shouldn't be this hard but I have reluctantly accepted that this "argument" is not made in good faith.

Which is why there is an effort to formally verify the unsafe use in the Rust standard library.

I would also say that unsafe causes a very different human reaction.

When like Zig, C or C++ everything is potentially unsafe then you can't scrutinize everything.

When a Rust PR contains unsafe code, everyone wants to understand what happens, because unsafe is rare and everyone is cautious about the dangers it poses. The first question on everyone's mind always is: Does this need unsafe?
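For readers who haven't seen one, this is roughly what such a reviewed unsafe block tends to look like, as a hedged Rust sketch (`sum_first` and `sum_first_safe` are hypothetical names). The `SAFETY:` comment states the argument the reviewer has to check, and the second function is the answer to "does this need unsafe?", which here is no.

```rust
/// Sum of the first `n` elements, or `None` if the slice is shorter than `n`.
pub fn sum_first(values: &[u32], n: usize) -> Option<u64> {
    if n > values.len() {
        return None;
    }
    let mut total = 0u64;
    for i in 0..n {
        // SAFETY: `i < n` from the loop range and `n <= values.len()` was
        // checked above, so `i` is always in bounds.
        total += u64::from(unsafe { *values.get_unchecked(i) });
    }
    Some(total)
}

/// The same function in safe Rust, using a checked sub-slice.
pub fn sum_first_safe(values: &[u32], n: usize) -> Option<u64> {
    let total: u64 = values.get(..n)?.iter().map(|&v| u64::from(v)).sum();
    Some(total)
}
```

The unsafe version has exactly one place to audit, and it is marked in the source.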

> When like Zig, C or C++ everything is potentially unsafe

It is not true that in Zig "everything is potentially unsafe". Zig offers bounds safety, which, BTW, eliminates the most dangerous kind of memory unsafety (https://cwe.mitre.org/top25/archive/2024/2024_cwe_top25.html).

Suppose I have a self-contained Zig project and it has a nasty memory safety bug - how can I identify where the cause might be? What parts of my project source are potentially unsafe?

You've said it's not everything, so, what's excluded? What can I rule out?

What is your reason to claim zig is safer than c++?

Bounds safety by default, nullability is opt-in and checks are enforced by the type-system, far less "undefined behaviour", less implicit integer casting (the ergonomics could still use some work here), etc.
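The list above is about Zig; the following is only a Rust analogue of two of its items (opt-in nullability and explicit integer narrowing), not Zig code, and the function names are made up for illustration.

```rust
use std::convert::TryFrom;

/// Length of an optional name. The value is only "nullable" because the
/// type says so, and the `None` case has to be handled before `s` is usable.
pub fn name_len(name: Option<&str>) -> usize {
    match name {
        Some(s) => s.len(),
        None => 0,
    }
}

/// Narrowing from `i32` to `u8` has to be requested explicitly and can fail;
/// there is no implicit conversion that silently truncates.
pub fn to_byte(n: i32) -> Option<u8> {
    u8::try_from(n).ok()
}
```

In both cases the check is forced by the type system rather than left to convention.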

This is on top of the cultural part, which has led to idiomatic Zig being less likely to heap allocate in the first place, and more likely to consider ownership in advance. This part shouldn't be underestimated.

> This part can't be underestimated.

You presumably intend "shouldn't be underestimated" rather than "can't be". I agree that culture is crucial, but the technology needs to support that culture and in this respect Zig's technology is lacking. I would love to imagine that the culture drives technology such that Zig will fix the problem before 1.0, but Zig is very much an auteur language like Jai or Odin: Andrew decides, and he does not seem to have quite the same outlook, so I do not expect that.

> You presumably intend "shouldn't be underestimated" rather than "can't be".

Good call, I've fixed that.

> Zig is safer than C++

Maybe if someone bends over backwards to rationalize it, but not in any real sense. Zig doesn't have automatic memory management or move semantics.

In C++ you can put bounds checking in your data structures and it is already in the standard data structures. You can't build RAII and moves into zig.

> Maybe if someone bends over backwards to rationalize it, but not in any real sense.

In a simple, real sense. Zig prevents out-of-bounds access just as Rust does; C++ doesn't. Interestingly, almost all of Rust's complexity is invested in the less dangerous kind of memory unsafety (https://cwe.mitre.org/top25/archive/2024/2024_cwe_top25.html).

> You can't build RAII and moves into zig.

So RAII is part of the definition of memory safety now?

Why not just declare memory safety to be "whatever Rust does", say that anything that isn't exactly that is worthless, and be done with that, since that's the level of the arguments anyway.

We could, of course, argue over which of Rust, Zig, and C++ offers the best contribution to correctness beyond the sound guarantees they make, except these are empirical arguments with little empirical data to make any determination, which is part of my point.

Software correctness is such a complicated topic and, if anything, it's become more, not less, mysterious over the decades (see Tony Hoare's astonishment that unsound methods have proven more effective than sound methods in many regards). It's now understood to be a complicated game of confidence vs cost that depends on a great many factors. Those who claim to have definitive solutions don't know what they're talking about (or are making unfounded extrapolations).

> C++ doesn't.

Then why do my data structures detect if I go out of bounds?

> Interestingly, almost all of Rust's complexity is invested in the less dangerous kind of memory unsafety

I didn't say anything about rust.

> So RAII is part of the definition of memory safety now?

Yes. You can clean up memory allocations automatically with destructors and have value semantics for memory that is on the heap.

> Why not just declare memory safety to be "whatever Rust does", say that anything that isn't exactly that is worthless, and be done with that, since that's the level of the arguments anyway.

Why are you talking about rust here? Focus on what I'm saying.

> We could, of course, argue over which of Rust, Zig, and C++

> if anything, it's become more, not less, mysterious over the decades

Says who?

I don't care about rust or zig, I'm saying that these are solved problems in C++ and I don't have to deal with them. Zig does not have destructors and move semantics.

> Then why do my data structures detect if I go out of bounds?

Because you have iterator debugging and/or assertions turned on and are only using non-primitive data structures (e.g. std::vector, std::array).

Zig does the thing that Rust and Go do where it makes the primary primitive for pointers to chunks of memory (slices) bounds checked. You can opt out with optimization settings, but I think most programs will build in "safe release" mode unless they're very confident in their test coverage.

It's strictly better than C++, because in practice codebases are passing lots of `(data, len)` params around no matter how strongly you emphasize in your style guide to use `std::span`. The path of least resistance in Zig, including the memory allocator interface, bundles in language-level bounds checking.
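Here is a short sketch of the slice idea in Rust, since that is one of the languages named above (`read_at` and `window` are hypothetical helpers). The slice carries its own length, so there is no separate `len` parameter to get wrong.

```rust
/// Read one element from a slice, or `None` if `index` is out of bounds.
pub fn read_at(data: &[u8], index: usize) -> Option<u8> {
    data.get(index).copied()
}

/// Checked window into a slice: returns `None` instead of reading past the
/// end, including when `start + len` would overflow.
pub fn window(data: &[u8], start: usize, len: usize) -> Option<&[u8]> {
    let end = start.checked_add(len)?;
    data.get(start..end)
}
```

Plain indexing such as `data[index]` is also checked in Rust and panics on an out-of-bounds index rather than reading adjacent memory.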

> I think most programs will build in "safe release" mode

Do you have any citations to support this 'safe release' theory? Like there are not many Zig applications and not many of them document their decisions. One I could find [1] does not mention safe anywhere.

[1] https://ghostty.org/docs/install/build

Memory leaks are unrelated to memory safety. That is to say, code that leaks memory is memory safe. So I'm not sure what RAII is supposed to help with.
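Rust's standard library reflects the same view: leaking is available from safe code. A minimal sketch, where the wrapper name `leak_value` is made up and `Box::leak` is the real standard library function:

```rust
/// Deliberately leak a heap allocation using only safe code.
pub fn leak_value(value: u32) -> &'static mut u32 {
    // `Box::leak` is a safe function. The allocation is never freed, but
    // the returned reference stays valid for the rest of the program, so
    // no memory-safety rule is broken.
    Box::leak(Box::new(value))
}
```

The leaked value can still be read and written through the returned reference; it simply never gets deallocated.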

A problem not solved in C++ is the need to reserve a single bit-pattern per type that can be moved from, to indicate that it has been moved from (and is not a valid value for any other purpose).
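For contrast only, a sketch of how Rust sidesteps the moved-from state (`Buffer`, `consume` and `Slot` are hypothetical types). A move makes the source binding unusable at compile time, so no bit pattern needs to be reserved; where an empty state is genuinely wanted at run time, it is spelled out with `Option`.

```rust
pub struct Buffer {
    pub bytes: Vec<u8>,
}

/// Takes ownership of the buffer. After calling `consume(b)`, using `b`
/// again is a compile error (E0382: use of moved value), so there is no
/// moved-from object to represent at run time.
pub fn consume(buffer: Buffer) -> usize {
    buffer.bytes.len()
}

/// A slot whose "emptied" state is part of its type.
pub struct Slot {
    pub buffer: Option<Buffer>,
}

impl Slot {
    /// Move the buffer out, leaving an explicit `None` behind.
    pub fn take_buffer(&mut self) -> Option<Buffer> {
        self.buffer.take()
    }
}
```

This says nothing about which approach is better overall, only that the reserved-pattern requirement is a consequence of non-destructive moves.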

> Then why do my data structures detect if I go out of bounds?

I didn't mean you can't write C++ code that enforces that, I said C++ itself doesn't enforce it.

> Yes. You can clean up memory allocations automatically with destructors and have value semantics for memory that is on the heap.

Surely there are other ways to do that. E.g. Zig has defer. You can say that you may forget to write defer, which is true, but the implicitness of RAII has caused (me, at least) many problems over the years. It's a pros-and-cons thing, and Zig chooses the side of explicitness.
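To show what "implicit" means here, a small RAII sketch in Rust, whose `Drop` trait plays the role of a destructor (`Guard` and `run` are made-up names). Nothing in the body of `run` calls the cleanup; it happens at the closing brace, in reverse order of declaration.

```rust
use std::cell::RefCell;

/// Guard that records its name when it goes out of scope.
pub struct Guard<'a> {
    log: &'a RefCell<Vec<&'static str>>,
    name: &'static str,
}

impl<'a> Drop for Guard<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Returns the order in which the body and the cleanups ran.
pub fn run() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _first = Guard { log: &log, name: "drop first" };
        let _second = Guard { log: &log, name: "drop second" };
        log.borrow_mut().push("body");
        // No explicit cleanup call: both guards are dropped at the end of
        // this block, `_second` before `_first`.
    }
    log.into_inner()
}
```

Whether that invisibility is a feature or a hazard is exactly the pros-and-cons question above.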

> Why are you talking about rust here? Focus on what I'm saying.

You're right, sorry :)

> Says who?

Says most people in the field of software correctness (and me https://pron.github.io). In the seventies, the prevalent opinion was that proofs of correctness would be the only viable approach to correctness. Since then, we've learnt two things, both of which were surprising.

The first was new results in the computational complexity of model checking (not to be confused with the computational complexity of model checkers; we're talking about the intrinsic computational complexity of the model checking problem, i.e. the problem of knowing whether a program satisfies some correctness property, regardless of how we learn that). This included results (e.g. by Philippe Schnoebelen) showing that language abstractions, which one might reasonably expect to make the problem easier, do not do so in the worst case.

The second was that unsound techniques, including engineering best practices, have proven far more effective than was thought possible in the seventies. This came as quite a shock to formal methods people (most famously, Tony Hoare, who wrote a famous paper about it).

As a result, the field of software correctness has shifted its main focus from proving programs correct to finding interesting confidence/cost tradeoffs to reduce the number of bugs, realising that there's no single best path to more correctness (as far as we know today).

> I'm saying that these are solved problems in C++ and I don't have to deal with them. Zig does not have destructors and move semantics.

That's true, but these are not memory safety guarantees. These are mechanisms that could mitigate bugs (though perhaps cause others), and Zig has other, different mechanisms to mitigate bugs (though perhaps cause others). E.g. see how easy it is to write a type-safe printf in Zig compared to C++, or how Zig handles various numeric overflow issues compared to C++. So while it's true that C++ has some features we may find helpful that Zig doesn't, and vice versa, we can't judge which of them leads to more correct programs. All I said was that Zig offers more safety guarantees than C++, which it does.
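On the overflow point, this is what explicit overflow handling looks like in Rust, offered only as an analogue and not as Zig's mechanism. The wrapper names are hypothetical; `checked_add`, `wrapping_add` and `saturating_add` are real methods on the integer types.

```rust
/// Addition that reports overflow instead of hiding it.
pub fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

/// Addition that wraps around modulo 256, requested explicitly.
pub fn add_wrapping(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

/// Addition that clamps at the maximum value, requested explicitly.
pub fn add_saturating(a: u8, b: u8) -> u8 {
    a.saturating_add(b)
}
```

The caller picks the overflow behaviour by name, so the choice is visible at the call site.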

> Zig has defer.

And C has free, but you have to remember to use it and use it correctly every single time instead of the memory working by default with no intervention.

> Says most people in the field of software correctness

Not true, the last 30 years have had much safer languages than before: Java, scripting languages, modern C++ and Rust.

> That's true, but these are not memory safety guarantees.

Pragmatically they mean you don't have to worry about bounds checking or memory deallocation and it stops being a problem. Zig doesn't have this and it doesn't have safety guarantees either.

Unless you actually use the simplicity to apply formal methods I don't think simplicity makes a language safer. The exact opposite. You can see it play out in the C vs C++ arena. C++ is essentially just a more complex C. But I trust modern C++ much more in terms of memory safety.

> Unless you actually use the simplicity to apply formal methods I don't think simplicity makes a language safer.

That depends what you mean by "safer", but it is an empirical fact that unsound methods (like tests and code reviews) are extremely effective at preventing bugs, so the claim that formal methods are the only way is just wrong (and I say this as a formal methods guy, although formal methods have come a long way since the seventies, when we thought the point was to prove programs correct).

> The exact opposite. You can see it play out in the C vs C++ arena. C++ is essentially just a more complex C. But I trust modern C++ much more in terms of memory safety.

I don't understand the logical implication. From the fact that there exists a complicating extension of a language that's safer in some practical way than the original you conclude that complexity always offers correctness benefits? This just doesn't follow logically, and you can immediately see it's false because Zig is both simpler and safer than C++ (and it's safer than C++ even if its simplicity had no correctness benefits at all).

> That depends what you mean by "safer", but it is an empirical fact that unsound methods (like tests and code reviews) are extremely effective at preventing bugs, so the claim that formal methods are the only way is just wrong (and I say this as a formal methods person)

I agree that tests and reviews are somewhat effective. That's not the point. The point is that if you look at the history of programming languages simplicity in general goes against safety. Simplicity also goes against human understanding of code. C and assembly are extremely simple compared to java, python, C#, typescript etc. yet programs written in C and assembly are much harder to understand for humans. This isn't just a PL thing either. Simplicity is not the same as easy, it often is the opposite.

> I don't understand the logical implication. From the fact that there exists a complicating extension of a language that's safer in some practical way than the original you conclude that complexity always offers correctness benefits? This just doesn't follow logically, and you can immediately see it's false because Zig is both simpler and safer than C++ (and it's safer than C++ even if its simplicity had no correctness benefits at all).

It's the greatest example of you take a simple language, you add a ton of complexity and it becomes more safe. You are right that zig is simpler and safer, but it's a green field language. Else I might as well say rust is more safe than zig and also more complex. The point is to isolate simplicity as the factor as much as possible.

I would even say that zig willingly sacrifices safety on the altar of simplicity.

> The point is that if you look at the history of programming languages simplicity in general goes against safety... C and assembly are extremely simple compared to java, python, C#, typescript

But Java and Python are simpler yet safer than C++, so I don't understand what trend you can draw if there are examples in both directions.

> It's the greatest example of you take a simple language, you add a ton of complexity and it becomes more safe.

But I didn't mean to imply that it's not possible to add safety with complexity. I meant that when the sound guarantees are the same in two languages, then there's an argument to be made that the simpler one would be easier to write more correct programs in. Of course, in this case Zig is not only simpler than C++, but actually offers more sound safety guarantees.

I do not find C code harder to understand than C++ - quite the opposite.