Hacker News new | ask | show | jobs
by vlovich123 479 days ago
Was Box::leak ever considered unsafe? std::mem::forget is very similar to that.

Crashes, stability, and performance issues are still not safety issues since there’s so many ways to cause those beyond memory leaks. I don’t know the discussion that was ongoing in the community but I definitely appreciate them taking a pragmatic approach and cutting scope and going for something achievable.

1 comments

Box::leak was added two years later in November 2017 https://github.com/rust-lang/rust/commit/360ce780fdae0dcb31c...

>Crashes, stability, and performance issues are still not safety issues since there’s so many ways to cause those beyond memory leaks.

They aren't safety issues according to Rust's definition, but Rust's definition of "unsafe" is basically just "whatever Rust prevents". But that is just begging the question: they don't stop being serious safety issues just because Rust can't prevent them.

If Rust said it dealt with most safety issues, or the most serious safety issues, or similar, that would be fine. Instead the situation is that they define data races as unsafe (because Rust prevents data races) but race conditions as safe (because Rust does not prevent them in general) even though obviously race conditions are a serious safety issue.

For example you cannot get memory leaks in a language without mutation, and therefore without cyclic data structures. And in fact Rust has no cyclic data structures naturally, as far as I am aware: all cyclic data structures require some "unsafe" somewhere, even if it is inside RefCell/Rc in most cases. So truly safe Rust (Rust without any unsafe at all) is leakfree, I think?

> Rust's definition of "unsafe" is basically just "whatever Rust prevents".

It's not that circular.

Rust defines data races as unsafe because they can lead to reads that produce corrupt values, outside the set of possibilities defined by their type. It defines memory leaks as safe because they cannot lead to this situation.

That is the yardstick for what makes something safe or unsafe. It is the same yardstick used by other memory-safe languages- for instance, despite your claims to the contrary, garbage collectors do not and cannot guarantee a total lack of garbage. They have a lot of slack to let the garbage build up and then collect it all at once, or in some situations never collect it at all.

There are plenty of undesirable behaviors that fall outside of this definition of unsafety. Memory leaks are simply one example.

>It defines memory leaks as safe because they cannot lead to this situation.

They can't now. They could up to and almost including 1.0. At that point the consensus was that memory leaks were unsafe and so unsafe code could rely on them not happening. That code was not incorrect! It just had assumptions that were false. One solution was to make those assumptions true by outlawing memory leaks. The original memory leak hack to trigger memory corruption was fairly fiendish in combination with scoped threads (IIRC).

>There are plenty of undesirable behaviors that fall outside of this definition of unsafety. Memory leaks are simply one example.

That is my whole point. It is a useless definition cherry-picked by Rust because it is what Rust, in theory, prevents. It does not precede Rust. Rust precedes it.

>It is the same yardstick used by other memory-safe languages- for instance, despite your claims to the contrary, garbage collectors do not and cannot guarantee a total lack of garbage. They have a lot of slack to let the garbage build up and then collect it all at once, or in some situations never collect it at all.

If it will eventually be collected then it isn't a memory leak.

Most actual safe languages don't let you write integer overflow.

> They can't now. They could up to and almost including 1.0. At that point the consensus was that memory leaks were unsafe and so unsafe code could rely on them not happening. That code was not incorrect!

This is not how it worked, no. It was never memory leaks per se that led to unsoundness there. It was skipping destructors. You could have the exact same unsoundness if you freed the object without running the rest of its destructor first.

That part was the design choice Rust made- make destructors optional and change the scoped threads API, or make destructors required and keep the scoped threads API.

There is an underlying definition of memory safety (or more generally "soundness") that precedes Rust. It is of course defined in terms of a language's "abstract machine," but that doesn't mean Rust has complete freedom to declare any behavior as safe. Memory safety is a particular type of consistency within that abstract machine.

This is why the exact set of undesirable-but-safe operations varies between memory-safe languages. Data races are unsafe in Rust, but they are safe in Java, because Java's abstract machine is defined in such a way that data races cannot lead to values that don't match their types.

Sure, safety is a relative moving target. There’s no way to prevent race conditions unless you have proofs. And then there’s no way to enforce that your proof is written correctly. It’s turtles all the way down. Rust is a Paretto frontier of safety for AOT high performance languages. Even for race conditions I suspect the tools Rust has for managing concurrency-related issues make it less prone to such issues than other languages.

The problem is you’re creating a hypothetical gold standard that doesn’t exist (indeed I believe it can’t exist) and then judging Rust on that faux standard and complaining that Rust chooses a different standard. That’s the thing though - every language can define whatever metrics they want and languages like C/C++ struggle to define any metrics that they win vs Rust.

> For example you cannot get memory leaks in a language without mutation, and therefore without cyclic data structures

This does not follow. Without any mutation of any kind, you can’t even allocate memory in the first place (how do you think a memory allocator works?). And you can totally get memory leaks without mutation however you narrowly define it because nothing prevents you from having a long-lived reference that you don’t release as soon as possible. That’s why memory leaks are still a thing in Java because there’s technically a live reference to the memory. No cycles or mutations needed.

> So truly safe Rust (Rust without any unsafe at all) is leakfree, I think?

Again, Box::leak is 100% safe and requires no unsafe at all. Same with std::mem::forget. But even if you exclude APIs like that that intentionally just forget about the value, again nothing stops you from retaining a reference forever in some global to keep it alive.

What is a type system except a bunch of proofs? You can encode some program correctness properties into types. Elevating the ones you happen to be able to encode and calling them "safety" and the rest "correctness" is just marketing.

I am not creating a gold standard because as far as I am concerned, it is all just correctness. There aren't morally more and less important correctness properties for general programs: different properties matter more or less for different programs.

>Without any mutation of any kind, you can’t even allocate memory in the first place (how do you think a memory allocator works?).

    data L t = E | C t (L t)
    data N = Z | S N

    nums Z = E
    nums (S n) = C (S n) (nums n)
You cannot express a reference cycle in a pure functional language but they still have allocation.

However I don't know why I brought this up, because you can also eliminate all memory leaks by just using garbage collection - you don't need to have immutable and acyclic data structures.

>Again, Box::leak is 100% safe and requires no unsafe at all. Same with std::mem::forget.

    #[inline]
    pub fn leak(b: Box<T>) -> &'static mut T {
        unsafe { &mut *Box::into_raw(b) }
    }
They are implemented using unsafe. There is no way to implement Box without unsafe.

If you retain a reference in a global then it is NOT a memory leak! The variable is still accessible from the program. You can't just forget about the value: its name is right there, accessible. That is not a memory leak, except by complete abuse of terminology. The concept of "inaccessible and uncollectable memory, which cannot be used or reclaimed" is a useful one. Your definition of a memory leak seems to be... any memory usage at all?

the unsafety is because of the lifetime laundering not because the operation is unsafe. The compiler doesn’t know that the lifetime of the underlying memory because static and decoupled from the lifetime of the consumed Box

And while we’re at it, please explain to me how this hypothetical language that allocates on the heap without mutable state exists without under the hood calling out to the real mutable allocator somewhere.

> If you retain a reference in a global then it is NOT a memory leak!

> Your definition of a memory leak seems to be... any memory usage at all?

It’s just that you’re choosing to define it as not a memory leak. Another definition of memory leak might be “memory that is retained longer than it needs to be to accomplish the intended goal”. That’s because users are indifferent to whether the user code is retaining the reference and forgetting about it or the user code lost the reference and the language did too.

So from that perspective tracing GC systems even regularly leak memory and then go on a hunt trying to reclaim it when they’ve leaked too much.

More importantly as has been pointed out numerous times to you, memory safety is a technical term of art in the field (unlike memory leaks) that specifically is defined as the issues safe Rust prevents and memory leaks very clearly do not fall under that very specific definition.

>the unsafety is because of the lifetime laundering not because the operation is unsafe.

You have missed the point. I said you can't leak memory in safe Rust. That is true. Box::leak isn't safe Rust: it uses the unsafe keyword. This is half the problem with the stupid keyword: it confuses people. I am saying that it requires the trustme keyword and you are saying it isn't inherently incorrect. Rust uses "unsafe" to mean both. But in context it is quite clear what I meant when talking about Box::leak, which you falsely claimed could be written in safe Rust.

>And while we’re at it, please explain to me how this hypothetical language that allocates on the heap without mutable state exists without under the hood calling out to the real mutable allocator somewhere.

What does the implementation have to do with anything? We are talking about languages not implementations. This isn't a difficult concept.

>It’s just that you’re choosing to define it as not a memory leak. Another definition of memory leak might be “memory that is retained longer than it needs to be to accomplish the intended goal”.

That isn't the definition. I am using the only definition of the term that any serious person has ever used.

>That’s because users are indifferent to whether the user code is retaining the reference and forgetting about it or the user code lost the reference and the language did too.

Users are completely irrelevant. It is logically impossible to ever prevent "leaks" that are just the storage of information. That isn't a leak, it is the intentional storage of information by the programmer. So it is a completely useless concept if that is what you want to use. It might be a useful concept in application user experience design or something but we are talking about programming languages.

On the other hand, "memory leaks" is a very useful concept if you use the actual definition because it is almost difficult to even conceive of a memory management strategy that isn't concerned with preventing memory leaks (proper). The "short lived program; free nothing" strategy is the only one I can think of, a degenerate case.

>More importantly as has been pointed out numerous times to you, memory safety is a technical term of art in the field (unlike memory leaks) that specifically is defined as the issues safe Rust prevents

No, it isn't! That is the definition that Rust people choose to use, which nobody used before 2015ish and is only widely used because Rust captured mindshare. It isn't some definition that predated Rust and which Rust magically fell right into.

Go back and look at mailing list threads, forum posts, papers, anything before Rust tried to steal the term "safety". It referred (and properly still refers) to programs. When people complained about manual memory management, the big complaint was that big C++ GUI programs (in particular) leaked memory like sieves. Nobody was particularly concerned about data races except the people implementing concurrency primitives in standard libraries etc. C++ didn't even have a defined memory model or standard atomics. Everyone was relying on x86's strong memory model in code all over the place. The big concern was avoiding manual memory management, memory leaks, and data corruption.

"Safe" didn't mean "has no data races but might have race conditions, has no use after free but might have memory leaks, and might have overflow bugs and SQL injections and improper HTML sanitisation". That would be a truly stupid definition. It meant "correct". The fanatical Rust community came along and tried to redefine "safe" to mean "the things we prevent". Rust's definition makes sense for Rust but it is Rust-specific because it is downstream of what Rust is capable of enforcing. Nobody would a priori come up with the particular subset of correctness properties that Rust happens to enforce and call them "safety". It is transparently a posteriori.