Hacker News new | ask | show | jobs
by musicnarcoman 480 days ago
While I am only a Rust novice it seems to me like the "2.2 Item 11: Implement the Drop trait for RAII patterns" could use some kind of mention of Drop-leaks. I learned about it at https://doc.rust-lang.org/nightly/nomicon/leaking.html
3 comments

Rust destructors are interesting.

- You can't export a reference to the thing you are dropping. You can do that in C++. This prevents "re-animation", where something destroyed comes back to life or is accessed beyond death. Microsoft Managed C++ (early 2000s), supported re-animation and gave it workable semantics. Bad idea, now dead.

- This is part of why Rust destructors cannot run more than once. Less than once is possible, as mentioned above.

- There's an obscure situation with Arc and destructors. When an Arc counts down to 0, the destructor is run. Exactly once. However, Arc countdown and destructor running are not an atomic operation. It is possible for two threads to see an Arc in a strong_count == 1 state just before the Arc counts down. Never check strong_count to see if you are "the last owner". That creates a race condition.[1] I've seen that twice now. I found race conditions that took a day of running to hit. Use strong_count only for debug print.

- A pattern that comes up in GUI libraries and game programming involves objects that are both in some kind of index and owned by Arcs. On drop, the object should be removed from the index. This is a touchy operation. The index should use weak refs, and you have to be prepared to get an un-upgradable Weak from the index.

- Even worse is the case where dropping an object starts a deletion of something else. If the second deletion can't be completed from within the destructor, perhaps because it requires a network transaction, it's very easy to introduce race conditions.

[1] https://github.com/rust-lang/rust/issues/117485

> - You can't export a reference to the thing you are dropping. You can do that in C++. This prevents "re-animation", where something destroyed comes back to life or is accessed beyond death. Microsoft Managed C++ (early 2000s), supported re-animation and gave it workable semantics. Bad idea, now dead.

>

> - This is part of why Rust destructors cannot run more than once. ...

This is a very backwards way to describe this, I think. Managed C++ only supported re-animation for garbage collected objects, where it is still today a fairly normal thing for a language to support. This is why these "destructors" typically go by a different name, "finalizers." Some languages allow finalizers to run more than once, even concurrently, but this is again due to their GC design and not a natural thing to expect of a "destructor."

The design of Drop and unmanaged C++ destructors is that they are (by default) deterministically executed before the object is deallocated. Often this deallocation is not by `delete` or `free`, which could perhaps in principle be cancelled, but by a function return popping a stack frame, or some larger object being freed, which it simply does not make sense to cancel.

> Never check strong_count to see if you are "the last owner".

This made me think of the `im` library[0] which provides some immutable/copy on write collections. The docs make it seem like they do some optimizations when they determine there is only one owner:

> Most crucially, if you never clone the data structure, the data inside it is also never cloned, and in this case it acts just like a mutable data structure, with minimal performance differences (but still non-zero, as we still have to check for shared nodes).

I hope this isn't prone to a similar race condition!

[0] https://docs.rs/im/15.1.0/im/index.html

The way to do this while avoiding race conditions seems to be `Arc::into_inner` or `Arc::get_mut`; for instance, the docs for `Arc::try_unwrap` mention a possible race condition, and recommend using `Arc::into_inner` to avoid it: https://doc.rust-lang.org/std/sync/struct.Arc.html#method.tr...
Managed C++ is pretty much around, kind of, as it got replaced by C++/CLI in .NET 2.0, is still used by many of us instead of dealing with P/Invoke annotations, is required by WPF infrastructure, and currently is on C++20 support level.
The important note here is that you can't rely on Drop running in order to satisfy the SAFETY comment of an unsafe block. In practice, in safe Rust, this knowledge shouldn't really change how you write your code.
The big foot-gun here is mem::forget rather than Drop itself. Although yeah it is pretty surprising that is considered safe.
It’s not that surprising when you consider that “unsafe” only concerns itself with memory safety. mem::forget is not unsafe from that perspective.

> In the past mem::forget was marked as unsafe as a sort of lint against using it, since failing to call a destructor is generally not a well-behaved thing to do (though useful for some special unsafe code). However this was generally determined to be an untenable stance to take: there are many ways to fail to call a destructor in safe code. The most famous example is creating a cycle of reference-counted pointers using interior mutability.

Leaking memory is unsafe. It was considered unsafe for decades: a prime example of the sort of problem you get in C or C++ that you avoid with automatic memory management. Lots of real crashes, stability issues and performance issues have been caused by memory leaks over the years.

Rust initially advertised itself as preventing leaks, which makes sense as it is supposed to have the power of automatic memory management but without the runtime overhead.

Unfortunately, shortly before Rust's release it was discovered that there were some APIs that could cause memory corruption in the presence of memory leaks. The decision was made that memory leaks would be too complicated to fix before 1.0: it would have had to have been delayed. So the API in question was taken out and Rust people quietly memory-holed the idea that leak freedom has ever been considered part of memory safety.

I think that's a retcon. Rust people did not "decide that leaking is safe" all of a sudden, that's cart-before-horse. Rust's memory model was still in its early stages back then and there was a belief (in hindsight, a mistaken belief) that destructors could be used as a means to guarantee memory safety. This turned out to be poorly reasoned and so, to preserve a consistent model of safety for other code, it was decided that having safety rely on the invocation of destructors was unsound. It's not possible to do this without also having leaks be safe, so that's the world as it is.

If "is leaking memory safe?" is an issue of contention for you, I'd suggest that it's a good idea to do some reading on what memory safety is (I mean that in all sincerity, not as a dunk). Memory safety, at least by the specific and highly useful definition used by compiler developers, is intimately entangled with undefined behaviour, but memory leaking sits entirely outside this sphere. This is as true in C and C++ as it is in Rust.

Another example of how your parent isn't really being accurate, memory leaks are also possible in garbage collected languages, yet they have been considered memory safe since well before Rust even existed.

It's not as if Rust invented the term "memory safety" or gets to define it.

Memory leaks are not possible in garbage collected languages unless you retain references to data, but by definition that isn't a memory leak, that is exactly the behaviour that you want.

Memory leaks are situations where memory is unrecovered despite there being no path to it from any active thread.

The difference is that leaking is not UB, the worst case is an OOM situation, which at worst causes a crash, not a security exploit. Crashing is also considered to be safe in rust, panicking is common for example when something bad happens.
Undefined behaviour is behaviour not defined by the language. So obviously Rust can define or undefine whatever it likes. It is not a sensible argument to say that something is safe because its behaviour is defined, or unsafe because it is undefined, when the whole point is that Rust's chosen definition of safety is just marketing.

I admit a better example is race conditions.

no, undefined behavior not just behavior that is not covered by the language definition. undefined behavior is term of art largely taken from C/C++, basically meaning that correct programs are assumed not to have these behaviors. for example, see https://en.cppreference.com/w/c/language/behavior. the definition of ub is not "just marketing". many major security vulnerabilities stem from having ub (out of bounds access, use after free). the point of rust is pretty much that you have to try hard to have ub, whereas in c/c++, it's basically impossible to not have ub.
Was Box::leak ever considered unsafe? std::mem::forget is very similar to that.

Crashes, stability, and performance issues are still not safety issues since there’s so many ways to cause those beyond memory leaks. I don’t know the discussion that was ongoing in the community but I definitely appreciate them taking a pragmatic approach and cutting scope and going for something achievable.

Box::leak was added two years later in November 2017 https://github.com/rust-lang/rust/commit/360ce780fdae0dcb31c...

>Crashes, stability, and performance issues are still not safety issues since there’s so many ways to cause those beyond memory leaks.

They aren't safety issues according to Rust's definition, but Rust's definition of "unsafe" is basically just "whatever Rust prevents". But that is just begging the question: they don't stop being serious safety issues just because Rust can't prevent them.

If Rust said it dealt with most safety issues, or the most serious safety issues, or similar, that would be fine. Instead the situation is that they define data races as unsafe (because Rust prevents data races) but race conditions as safe (because Rust does not prevent them in general) even though obviously race conditions are a serious safety issue.

For example you cannot get memory leaks in a language without mutation, and therefore without cyclic data structures. And in fact Rust has no cyclic data structures naturally, as far as I am aware: all cyclic data structures require some "unsafe" somewhere, even if it is inside RefCell/Rc in most cases. So truly safe Rust (Rust without any unsafe at all) is leakfree, I think?

> Rust's definition of "unsafe" is basically just "whatever Rust prevents".

It's not that circular.

Rust defines data races as unsafe because they can lead to reads that produce corrupt values, outside the set of possibilities defined by their type. It defines memory leaks as safe because they cannot lead to this situation.

That is the yardstick for what makes something safe or unsafe. It is the same yardstick used by other memory-safe languages- for instance, despite your claims to the contrary, garbage collectors do not and cannot guarantee a total lack of garbage. They have a lot of slack to let the garbage build up and then collect it all at once, or in some situations never collect it at all.

There are plenty of undesirable behaviors that fall outside of this definition of unsafety. Memory leaks are simply one example.

Sure, safety is a relative moving target. There’s no way to prevent race conditions unless you have proofs. And then there’s no way to enforce that your proof is written correctly. It’s turtles all the way down. Rust is a Paretto frontier of safety for AOT high performance languages. Even for race conditions I suspect the tools Rust has for managing concurrency-related issues make it less prone to such issues than other languages.

The problem is you’re creating a hypothetical gold standard that doesn’t exist (indeed I believe it can’t exist) and then judging Rust on that faux standard and complaining that Rust chooses a different standard. That’s the thing though - every language can define whatever metrics they want and languages like C/C++ struggle to define any metrics that they win vs Rust.

> For example you cannot get memory leaks in a language without mutation, and therefore without cyclic data structures

This does not follow. Without any mutation of any kind, you can’t even allocate memory in the first place (how do you think a memory allocator works?). And you can totally get memory leaks without mutation however you narrowly define it because nothing prevents you from having a long-lived reference that you don’t release as soon as possible. That’s why memory leaks are still a thing in Java because there’s technically a live reference to the memory. No cycles or mutations needed.

> So truly safe Rust (Rust without any unsafe at all) is leakfree, I think?

Again, Box::leak is 100% safe and requires no unsafe at all. Same with std::mem::forget. But even if you exclude APIs like that that intentionally just forget about the value, again nothing stops you from retaining a reference forever in some global to keep it alive.

Yes, thanks, I read the article. Nevertheless, it's still a surprising footgun.
Unsafe is concerned with unsafe blocks. NonZero::new_unchecked requires unsafe even though it’s not concerned with mem safety.
I believe the optimizer will do optimizations in response to the NonZero which can trigger UB if it does contain a 0, which is a traditional safety issue for Rust which can have no UB in safe code. But even the value being corrupt (ie NonZero returning 0) can cause memory safety issues. But yes, Rust also uses unsafe to bypass enforcing invariants, which std::mem::forget isn’t.
What's unsafe about implicitly "leaking" memory?
Running out of memory and killing the OS I would guess unless the OS kills misbehaving process first.
So pretty much completely safe, then?
Destructors do more than just free memory.
I've specifically said leaking memory? Then again, destructor that didn't run would be an application level error not "safety in rust" error.
You were responding to my comment, which had scope broader than just leaking memory. So, to suggest it is only about leaking memory is not really responsive.