Rust – Compile Time Memory Safety (kkimdev.github.io)
175 points by kkimdev 2615 days ago
6 comments

The author says that this is "why Rust is interesting". I've done a little bit with Rust -- no production code, but some simulations and things like that. While compile time memory safety was the first thing to interest me in Rust, I don't even think it's the primary selling point anymore. The selling point to me is that nearly everything behaves in predictable ways.

E.g. since immutable references are the default, you immediately know whether or not a call will modify its argument. E.g. the existence of traits and compiler warnings for style mean that method naming conventions are incentivized, because implementing a trait usually gives you a lot for free. Even performance characteristics are very predictable, a statement which I can only otherwise make about C and assembly (and those are not predictable at all in other ways).
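A minimal sketch of the point about references, with made-up function names (`total` and `append_total` are hypothetical): the signature and the call site together show whether an argument can be modified.

```rust
// Hypothetical helpers, named only for this illustration.

/// Takes a shared reference: the signature promises the data is only read.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Takes an exclusive reference: the `&mut` at the call site makes the
/// possibility of modification visible to the reader.
fn append_total(values: &mut Vec<i32>) {
    let sum: i32 = values.iter().sum();
    values.push(sum);
}

fn main() {
    let mut data = vec![1, 2, 3];
    println!("total = {}", total(&data)); // read-only call
    append_total(&mut data); // mutation is explicit here
    println!("data = {:?}", data);
}
```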

Yes, the memory safety does play into this. It's much more than that, though: memory safety is just one factor in a kind of pervasive predictability that the language encourages through its very design.

The biggest selling point for Rust is a non-obvious one and one that is hard to "sell" with screenshots: refactoring. Despite having few tools for automatic refactoring, manual refactoring of large Rust codebases is a breeze: change the portion of code you care about, follow the compiler's complaints and by the end you're likely to have a codebase in a good state. I wouldn't be able to go back to a dynamic language after using Rust for so long, I'd be in constant panic about changing the smallest thing. I equate it to the discomfort I feel when getting into a car with no seatbelts. I'm just waiting for the day when I can plug the change I want to make into rustfix and let it figure it out, with help from the compiler.
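A small sketch of that workflow, using a hypothetical `Shape` enum (not from any real codebase): with no wildcard arm, adding a variant turns every `match` that has not been updated into a compile error, so the compiler lists the places that still need attention.

```rust
// Hypothetical types for illustration only.
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
    // Adding a new variant here, e.g. `Triangle { base: f64, height: f64 }`,
    // makes the `match` below fail to compile (error E0004, non-exhaustive
    // patterns) until the new case is handled.
}

fn area(shape: &Shape) -> f64 {
    // No wildcard arm on purpose: a `_ => ...` arm would silence the error.
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rect { width, height } => width * height,
    }
}

fn main() {
    let shapes = [
        Shape::Circle { radius: 1.0 },
        Shape::Rect { width: 2.0, height: 3.0 },
    ];
    for shape in &shapes {
        println!("{:.2}", area(shape));
    }
}
```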
> refactoring of large Rust codebases is a breeze: change the portion of code you care about, follow the compiler's complaints

At my $DAYJOB I write a lot of Java, and modern full-featured IDEs for Java like IntelliJ are absolutely awesome for refactoring. Most common refactorings are fully automated, and many more complex things are very easy to do by combining some of the base refactorings.

I really like writing Rust in side projects, but after getting used to the phenomenal refactoring support in Java, having to follow the compiler's diagnostics feels very primitive.

JetBrains' refactoring tools have always been top tier. For years ReSharper was absolutely essential for C# development because Visual Studio just lacked good refactoring. VS has been catching up and has mostly caught up in recent editions, but I doubt that would have happened if MS had not been able to see just how popular ReSharper is.

I'd hope that as Rust grows in popularity the tooling will come from that, so I'd keep an eye on the IntelliJ Rust plugin or Rider support for Rust. As you say, it's really difficult to go from a day job where, for example, promoting a local variable to a parameter (with replacement at the call sites) just works, to one where you actually have to think about manual refactoring.

One reason why I'm optimistic about Rust is because the team not only explicitly recognizes the need for advanced tooling (such as IDEs), but owns critical parts of the underlying infrastructure, such as RLS. If you want a language to have quality tooling, it needs to be designed for that, and there's no better forcing function for that than having to write that tooling yourself, or at least having someone who does on the same team, in constant communication.
> I'd hope that as Rust grows in popularity the tooling will come from that ...

Patches are welcome! https://areweideyet.com/

The Rust plug-in is already quite nice, and getting better all the time. I use it from CLion, since that has integrated debugging support.
You are mixing "fancy tools for mature language" with core capabilities of another (that will enable even better fancy refactoring tools in future - tools that can make more complex refactoring tasks fully automated). If anything, the fact that what used to be an external tool has migrated to core feature in Rust is a testament to how forward looking the language is. Imagine what that will enable in tools to come!
I'm not implying that Rust should have these yet; I am fully aware that the situation for Java is different due to serious and long efforts in developing great refactoring tools.

The point I was trying to make is that the view that Rust is great for refactoring depends heavily on what one is used to. For example, as compared to any dynamically typed language, a strongly typed language will be better for refactoring.

Writing refactorings that work reliably is very complex, and I'm not sure if it is practically possible to write the same kinds of refactorings that Java has for Rust. Two of my favourite refactorings in IntelliJ are extract method (with automatic duplication finding) and inline method/expression. I think that such refactorings are a lot more complex to do reliably in Rust due to ownership and borrowing rules, unnameable types, and so on.

I fully expect rustfix to gain refactoring capabilities, even if haphazardly bolted on. Things like changing an owned field to a borrow require a lot of trivial changes that can be automated away (other than making sure the field being borrowed has an appropriate lifetime, of course), and for the most part rustc already emits these as structured suggestions that other tools can use.
IntelliJ plugin for Rust can do some refactoring (renaming functions and variables - it's all I use and remember).
For Rust development I use CLion with the Rust plugin, and it is getting better and better all the time.

It is interesting to see the difference in people's expectations of what an IDE can and should be able to do for you, depending on what they are used to working with.

I'm really glad to see that there is a huge resurgence in support for statically typed languages in mainstream programming. I always felt that this was somewhat inevitable, but the tide of opinion was against it for a while. I completely agree, and the inverse situation is true too. If you're used to statically typed languages, working with a dynamically typed language can be hellish if your codebase is suitably complex! And by suitably complex I mean more than 500 LOC. (That's not a typo, the omission of the 'K' is intentional.)
I'd love your thoughts on this, my guess is it's because of ergonomics improvements. Specifically, inferred types. This middle-ground lets you have much of the ergonomics of a dynamic language but with the safety of a static one. You may have to hint it from time to time, but for the most part (99%+) Rust guesses the types of everything I do day to day, and when it doesn't know, it ... asks.
Inferred typing is just a feature of statically typed languages. Static typing just refers to the type of a variable being known at compile-time. Type inference is the ability of the compiler to infer the variable's type from the context of its declaration. It doesn't mean that there's any loss of safety at all, it's just a matter of syntax.
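To make the distinction concrete, here is a rough sketch (the helper `word_lengths` is hypothetical): only the function signature is annotated, the local types are inferred, and everything is still checked at compile time.

```rust
use std::collections::HashMap;

// Hypothetical helper for illustration: only the signature carries annotations.
fn word_lengths(text: &str) -> HashMap<&str, usize> {
    // No annotation on `lengths`: the key and value types are inferred from
    // the `insert` call below and from the return type.
    let mut lengths = HashMap::new();
    for word in text.split_whitespace() {
        lengths.insert(word, word.len());
    }
    lengths
}

fn main() {
    let lengths = word_lengths("static types without the noise");
    // Still statically typed: `get` returns an Option<&usize>, and treating
    // it as a string would be rejected at compile time.
    println!("{:?}", lengths.get("static"));

    // When the compiler cannot infer a type it asks for a hint, as with `parse`:
    let n: u32 = "42".parse().expect("not a number");
    println!("{}", n + 1);
}
```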
Yep, exactly; I was suggesting that the move in modern languages towards inferred typing is what makes statically typed languages pleasant to work in, which has led to their return to popularity. Sorry if I wasn't clear!
In some sense. I mean, type inference is not new. But the languages that really brought static typing to the masses didn’t use it, or only did deduction, so it is “new” to a lot of programmers.

I do think that there’s a relationship between power and usefulness, but more power isn’t always better. I’d rather use Ruby than Java 1.5, but I’d rather use Rust than Ruby. YMMV.

It's kind of sad to me that (almost) complete type inference has been around since the [70s](https://en.wikipedia.org/wiki/Hindley–Milner_type_system) and it's still not mainstream. (unless you consider F# or OCaml mainstream languages)
> change the portion of code you care about, follow the compiler's complaints and by the end you're likely to have a codebase in a good state.

That has always been my refactoring algorithm with C++, C#, or TypeScript now. I know a lot of smart people using dynamic languages, but I always prefer to lean on the compiler to tell me what needs to be changed.

> I wouldn't be able to go back to a dynamic language after using Rust for so long, I'd be in constant panic about changing the smallest thing.

As someone who has been working mainly in Rust for about a year and is now starting to pick up Python, this is huge! I pretty much never write Python that isn't explicitly typed thanks to how Rust changed me as a programmer.

Same. I think Rust has been the biggest thing for me in terms of growth as an engineer in recent memory, and I've worked across the stack from HDLs to embedded to mobile to desktop to server over the course of my career.
I think the question was what the biggest selling point is for those who already do e.g. C++, not for those using dynamic languages like Python.
Both questions are important, though. The Rust community is in a very compelling position, in being able to appeal to both the C/C++/'system programming' and the Python/Ruby/etc/'high-level application programming' dev communities.
What I was trying to say was that ease of refactoring compared to a dynamic language like Python isn't omitted because it's "non-obvious". Rather, it's in fact quite obvious, and simply omitted because it's answering a different question than the one that was asked. Robust refactoring isn't Rust-specific or even new. It's always been the selling point of many statically typed languages, like C++, D, Java, etc.
I mentioned Python because it sits in the other extreme, but the combination of pattern matching, strong types, type inference and ergonomic combinators on the standard library makes the experience much nicer than, for example, Scala. Robust refactoring isn't new. Rust isn't novel. It's just well put together because it had the benefit of hindsight and being able to consider how different advanced features fit together without having to abide by backwards compatibility until recently. Refactoring a single threaded process that operates over a vector can be a single line code change, and the compiler is capable of complaining about data races by leveraging the type system and lifetime analysis.
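The one-line change mentioned above most likely refers to a library such as rayon, which is an assumption on my part. As a std-only sketch of the underlying idea (the function `parallel_sum` is hypothetical), scoped threads can borrow disjoint halves of a slice, and the compiler checks that nothing is shared mutably:

```rust
use std::thread;

// Hypothetical function for illustration: sums a slice using two scoped threads.
fn parallel_sum(values: &[u64]) -> u64 {
    let (left, right) = values.split_at(values.len() / 2);
    thread::scope(|s| {
        // Each thread borrows a disjoint, read-only half; the borrow checker
        // verifies that the borrows do not outlive the scope.
        let a = s.spawn(|| left.iter().sum::<u64>());
        let b = s.spawn(|| right.iter().sum::<u64>());
        a.join().unwrap() + b.join().unwrap()
    })
}

fn main() {
    let data: Vec<u64> = (1..=100).collect();
    println!("{}", parallel_sum(&data));
    // Letting both threads push to one shared Vec through `&mut` would be
    // rejected at compile time: two closures cannot both borrow it mutably.
}
```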
I think you'd be surprised, Rust has seen a lot of adoption & interest from the dynamic languages crowd.
Go and Rust killed my interest in writing Python code. A paradigm shift worth losing amazing libraries and frameworks like SQLalchemy and Django is rare, but I believe we are here (and have been for a while.)
> Go and Rust killed my interest in writing Python code.

Strange to see Go suddenly brought in to support Rust like this. Was it Go that killed said interest or Rust? Why is Go relevant here?

This is one of the big reasons I use TypeScript over plain JS as well, especially when combined with TSX for typesafe views.
Actually I got into Rust because it is the only language that has both ADTs/pattern matching and a practical mindset/community. There is simply no alternative. I would even write front-end code in Rust for the sake of this.
ReasonML also has ADTs and pattern matching.
True, and Swift. They are nice to work with too.
> E.g. since immutable references are the default, you immediately know whether or not a call will modify its argument.

You do not, since the referenced type could use interior mutability (e.g. RefCell). So you only know whether this is the case after inspecting the type.

Of course, the default in Rust is to use exterior mutability and in general, interior mutability is/should be avoided.

(I agree with the general thrust of your comment though.)

Rust's mutable/immutable is better thought of as exclusive/shared. But in any case, the interior mutability is not causing problems, because the compiler still largely enforces it's used correctly (e.g. no matter how deeply RefCell is hidden, code using it won't be able to share it with another thread, so you don't need to worry about getting a data race this way).
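A short sketch of both points, using a hypothetical `Lookup` type: a method taking `&self` can still mutate through a `RefCell`, but the compiler keeps that mutation confined to one thread.

```rust
use std::cell::RefCell;

// Hypothetical type for illustration: looks read-only from the outside.
struct Lookup {
    hits: RefCell<u32>, // interior mutability: writable through `&self`
}

impl Lookup {
    fn get(&self, key: &str) -> usize {
        *self.hits.borrow_mut() += 1; // mutation behind a shared reference
        key.len()
    }
}

fn main() {
    let lookup = Lookup { hits: RefCell::new(0) };
    let shared = &lookup; // a plain `&` reference
    shared.get("a");
    shared.get("bb");
    println!("hits = {}", lookup.hits.borrow());

    // `RefCell` is not `Sync`, so handing `shared` to another thread, e.g.
    // `std::thread::scope(|s| { s.spawn(|| shared.get("c")); });`
    // is rejected by the compiler.
}
```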
I fully agree with you, and actually I want to write a summary of those less-discussed features that have high productivity impact including the predictability you mentioned in the future.

Though, for the people who haven't explored Rust yet, I still think that focusing on memory safety, the most powerful feature, is a good approach. Personally, I tried explaining other smaller benefits first, e.g., immutable by default, move by default, no header files, but it didn't work as well as I thought it would. Exploring another language is a significant investment, and people need a significant reason (or at least one that appears significant at first glance).

And there is a small marketing problem for Rust. The memory management is the big ticket thing we like to show off, but the big benefits of the language are much smaller and boil down to quality of life and composability.
Yes, banging on about memory safety sells Rust short, and drives away exactly the people who would benefit most from improved memory safety.

The language is just overwhelmingly better to code in than C, or Java, or C#, or Go. If the compiler were to be made fast -- and there is nothing like a fast compiler coded in your language to advertise its speed (and the reverse!) -- or anyway JITted, with a REPL, it could replace a great deal of scripting.

The comparison to C++ is much less compelling. Rust fans like to lump C and C++ together, but in modern C++ there are few temptations to memory unsafety. (They might be misled by crufty Mozilla code.) Meanwhile, the greater expressivity of C++ enables more powerful libraries, and each use of a good library eliminates many more than just memory bugs.

I agree with you, for sure. We had a huge community discussion about this in 2016: https://brson.github.io/fireflowers/
Rust has a great type system that enables the kinds of things you're talking about. It's like C++ and Haskell came together and had a baby.
> Rust enforces single mutable ownership or multiple readonly aliases at a time. In fact, they are very good idioms to structure large codebase anyways, and normally they do not get in the way for ordinary applications.

No, these limitations routinely get in the way for ordinary applications. The borrow checker is a source of frustration when ramping up. Back-references get smuggled in as array indexes. Prohibiting global variables is tough. Any sort of app that can't be structured as a tree is going to have pain.

This safety is really valuable, but let's not pretend it comes for free.

The author is somewhat mistaken there - what Rust actually enforces is a clear alternative of exclusive ownership/borrowing, or shared access with multiple aliases being active at the same time. While these are normally identified with "mutable" vs. "readonly" access, this is not true in some cases, where special structures with "interior mutability" can be provided with different behavior. For example, if you need to share writable access to a piece of data, you can use the "Cell<>" or "RefCell<>" generic types. For an object which needs to have multiple "owners", each of which can extend its lifetime and prevent the object from being freed, there is the Rc<> type, etc. This stuff may not come for free, but quite often its cost can be made very reasonable while preserving desirable safety properties.
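As a rough illustration of the types named above (the `bump` helper is hypothetical), `Rc` gives several owners and `Cell` gives shared writable access, with no `&mut` anywhere:

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical helper: increments through a shared reference.
fn bump(counter: &Cell<i32>) {
    counter.set(counter.get() + 1);
}

fn main() {
    // Two owners of one counter; neither holds a `&mut`.
    let counter = Rc::new(Cell::new(0));
    let other_owner = Rc::clone(&counter);

    bump(&counter);
    bump(&other_owner);
    println!("value = {}, owners = {}", counter.get(), Rc::strong_count(&counter));

    // The value stays alive as long as any owner remains.
    drop(other_owner);
    println!("owners = {}", Rc::strong_count(&counter));
}
```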
The performance cost of interior mutability is often small. IMO the real cost is the undesirable safety properties.

For example consider an API like JS's getElementById(). In Rust, if a caller frame has a reference to the same element, this would just panic. It's impossible to statically enforce that no caller can have a reference to this element, and it's unreasonable to require it at runtime. So you either give up safety guarantees (viable, e.g. gtk-rs, but it leaves Rust anemic) or you give up the entire programming model (maybe viable, still a research project).

getElementById() is perfectly allowable under Rust's rules, though it may not look like what you expect (and to be fair Rust doesn't make it easy to write today).

The most straightforward way to get it working would be to return a `&Element` (or `Rc<Element>` if you like), and make all of `Element`'s fields `Cell`s. No panics, no runtime checks, and you can do everything JS can do. The cost is "infecting" the type definitions with `Cell<T>` and the usage with `.get`/`.set`, and the loss of what is normally rich pointer aliasing information for the optimizer (but which other languages don't have to begin with).
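A minimal sketch of that approach, with hypothetical `Document` and `Element` types (this is not a real DOM or GUI API): the lookup returns `&Element`, every field is a `Cell`, and aliased handles can both write.

```rust
use std::cell::Cell;
use std::collections::HashMap;

// Hypothetical DOM-like types; not a real browser or GUI API.
struct Element {
    width: Cell<u32>,
    visible: Cell<bool>,
}

struct Document {
    elements: HashMap<String, Element>,
}

impl Document {
    // Returns a shared reference, so any number of callers may hold one.
    fn get_element_by_id(&self, id: &str) -> Option<&Element> {
        self.elements.get(id)
    }
}

// Builds a one-element document for the demo.
fn sample_document() -> Document {
    let mut elements = HashMap::new();
    elements.insert(
        "main".to_string(),
        Element { width: Cell::new(100), visible: Cell::new(true) },
    );
    Document { elements }
}

fn main() {
    let doc = sample_document();
    // Two live references to the same element, as in JS.
    let a = doc.get_element_by_id("main").unwrap();
    let b = doc.get_element_by_id("main").unwrap();
    a.width.set(a.width.get() * 2); // no runtime borrow check, no panic
    b.visible.set(false);
    println!("{} {}", b.width.get(), a.visible.get());
}
```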

The reason Rust `&mut` feels so restrictive is that it allows you to change the "shape" of the object, thus invalidating (if they existed) any other references into it- replace an enum value with a new variant, reallocate a `Vec`, overwrite a `Box`, etc. But in other memory-safe languages you can't do any of those things. Instead any "shape changing" is done by allocating a new GC'd value and overwriting a pointer- enums are boxed (or don't allow interior pointers), arrays only contain primitives or pointers, etc.

So I like to think of `&Cell<T>` as a kind of third reference mode, that matches what people expect from other languages. It's not fun to use today, but there are a couple of language additions that could make it much, much nicer:

* First, field projection- given a type `struct S { x: T }` and a value `r: &Cell<S>`, let `r.x: Cell<T>`. This is safe as you can't invalidate `&r.x` by overwriting `r`- but by extension you can't project a reference through a `Cell` into an enum or `Vec` (just like other languages as described above).

* Second, some syntactic sugar for reading and writing. Replace `cell.get()` and `cell.set(x)` with `cell` and `cell = x`. Given that `Cell` has zero overhead (other than the loss of optimizations described above) this shouldn't be an issue.

The more idiomatic way would be to return an Rc<RefCell<Element>>.

Accessing it will indeed panic if someone else has an active mutable reference to the contents of the refcell, but the idea is that you should only keep that for the section of code that actually modifies the object.

The reason this feature exists is that it prevents code observing partially modified objects that could be temporarily missing required invariants (in addition to preventing mutating object with references to parts of them which can result in dangling pointers and thus memory unsafety).

You can also use Cell as the parent argues with the benefit of never failing, but then you don't get protection from recursive calls exposing violated invariants and you need to change the implementation of Element itself.
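For comparison, a sketch of the `Rc<RefCell<Element>>` variant (the `Element` type and `rename` helper are hypothetical). The mutable borrow is held only for the update, and a conflicting borrow is refused at runtime:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical element type for illustration.
struct Element {
    label: String,
}

// Keeps the mutable borrow only for the duration of the update.
fn rename(element: &Rc<RefCell<Element>>, label: &str) {
    element.borrow_mut().label = label.to_string();
}

fn main() {
    let element = Rc::new(RefCell::new(Element { label: "old".into() }));
    let alias = Rc::clone(&element);

    rename(&alias, "new");
    println!("{}", element.borrow().label);

    // While one mutable borrow is alive, a second one is refused at runtime.
    let guard = element.borrow_mut();
    // `alias.borrow_mut()` would panic here; `try_borrow_mut` reports it instead.
    println!("second borrow ok? {}", alias.try_borrow_mut().is_ok());
    drop(guard);
    println!("after drop ok? {}", alias.try_borrow_mut().is_ok());
}
```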

To me it was the opposite: it gave me a vocabulary and taught me how to think about these problems.

Shared mutable state and ownership exists in C, but I just don't get any compiler support for it. I can't even document it in code, so I (and users of my libraries) rely on RTFM.

In C I'd just "wing it", and tweak the code until it stops crashing. Maybe add a flag with "obj.free_data_ptr = true" and keep adding mutexes or copies of data where I suspect it's necessary.

In Rust I get predefined templates for this: owns & borrows, cells/atomics, refcounted and mutex containers, etc. The compiler says "nope, this is wrong!" and I get to consciously decide how to solve it: do I share or copy the data? Is the sharing dynamic, or just in a wrong scope? And my decisions are documented in code, and enforced by the compiler.
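As a small sketch of "decisions documented in code" (the `record` function is hypothetical): the parameter type states that the log is shared across threads (`Arc`) and that access is synchronized (`Mutex`).

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical worker function: the signature documents that the log is
// shared (Arc) and that access to it is synchronized (Mutex).
fn record(log: Arc<Mutex<Vec<String>>>, worker: usize) {
    log.lock().unwrap().push(format!("worker {} done", worker));
}

fn main() {
    let log = Arc::new(Mutex::new(Vec::new()));
    let handles: Vec<_> = (0..4)
        .map(|i| {
            let log = Arc::clone(&log); // each thread gets its own handle
            thread::spawn(move || record(log, i))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    println!("{} entries", log.lock().unwrap().len());
}
```

Swapping `Arc<Mutex<...>>` for a plain `&mut Vec<String>` in the spawned closures does not compile, which is the "nope, this is wrong!" moment described above.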

Isn't the example with re-using a moved c++ vector completely safe?

    std::vector<int> x = {1, 2, 3};
    process(std::move(x));
    x.push_back(4);
    // Runtime invalid memory access
Oh, I didn't know about this one. I did some searching, and it looks like it should be in a "valid but unspecified state" after the move?

I can't think of any case where relying on unspecified state is desirable, even if it's valid, though I guess it's better if I change that to x.pop_back(); to be clear.

Please let me know if my understanding is incorrect and thanks for the information!

The only thing 'unspecified' guarantees you in this context is 'safe to destroy'. It specifically does not guarantee the safety of any other member functions - only the destructor. So either push_back() or pop_back() could potentially cause UB here (specifically, it's quite possible that the move swaps some internal pointers for nullptr, so you end up dereferencing null here - but morally, it's just never okay to continue using a moved-from object).
The vector is "valid" and that is what carries all the weight here. The vector is still a vector.

push_back is absolutely defined. pop_back might be undefined, because pop_back is UB on an empty vector. If you like, call clear, and be assured of an empty, reusable vector. It's not idiomatic, but it's safe.

A moved-from std::vector<int> will always be empty. However, a moved-from std::vector<int, custom_stateful_allocator> may not be.

Howard Hinnant had a Stack Overflow reply a while back going through the possible corner cases of this precise question: https://stackoverflow.com/a/17735913

Yes, it's completely safe and its rust equivalent is

    let mut x = vec![1, 2, 3];
    process(std::mem::replace(&mut x, Vec::new()));
    x.push(4);
https://doc.rust-lang.org/std/mem/fn.replace.html
There's no guarantee that std::move moves elements out of x. (and thus no guarantee that it's equivalent to a Rust mem::replace with a new Vec) The only requirement the C++ standard imposes is that the vector is left in a "valid but unspecified state".
It’s runtime safe (I’m taking your word for it). In Rust this would be a compile time error.
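A sketch of the compile-time error being referred to, assuming a hypothetical `process` that takes the vector by value:

```rust
// Hypothetical consumer: takes the vector by value, so ownership moves in.
fn process(values: Vec<i32>) -> usize {
    values.len()
}

fn main() {
    let x = vec![1, 2, 3];
    let n = process(x); // `x` is moved here
    println!("processed {} items", n);

    // Uncommenting the next line fails to compile with error E0382
    // (borrow of moved value: `x`):
    // println!("{:?}", x);

    // To keep using the data, hand over an explicit copy instead.
    let y = vec![1, 2, 3];
    process(y.clone());
    println!("{:?}", y);
}
```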

Although, are any C++ compilers able to at least issue a warning in this case? It wouldn’t surprise me if they could.

Move constructors are allowed to call the moved object's destructor.

Not necessarily like the example given, but

    Object a {std::move(x)};
    x.push_back(4);
could/should segfault.
No, move constructors are never allowed to call the moved-from argument's destructor. Ever.

Sometimes the compiler calls that destructor after it has finished the move, if the thing is no longer in scope. That should not be confused with a thing happening in the constructor.

Yes it is safe.

    int x;
    int y = square(x);
    // Passing a garbage value at runtime.
My C++ compiler doesn’t build that:

error C4700: uninitialized local variable 'x' used

That's because you're compiling with warnings-as-errors.

And even then it's still unable to detect anything but the simplest case. E.g. this compiles with no warnings:

    void foo(int& x) {}

    int x;
    foo(x);
    int y = square(x);
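For contrast, a rough Rust counterpart (the `square` and `pick` functions are hypothetical): deferred initialization is accepted only when every path assigns the variable, and passing a reference to it before that point is a compile error.

```rust
// Hypothetical function mirroring the C++ snippet.
fn square(x: i32) -> i32 {
    x * x
}

fn pick(flag: bool) -> i32 {
    let x: i32; // declared, not yet initialized
    if flag {
        x = 3;
    } else {
        x = 4;
    }
    // Accepted: every path above assigns `x`. Removing the `else` branch, or
    // passing `&mut x` to a function before assigning, is rejected with
    // error E0381 (use of a possibly uninitialized binding).
    square(x)
}

fn main() {
    println!("{} {}", pick(true), pick(false));
}
```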
> you're compiling with warnings-as-errors.

I'm not. The compiler is VC++ with default project settings. "Treat warnings as errors" setting is set to "No (/WX-)"

By default, VC++ compiler turns on this for new C++ projects: https://docs.microsoft.com/en-us/cpp/build/reference/sdl-ena... That's why it fails to build.

C4700 is a warning per VC++ docs:

https://docs.microsoft.com/en-us/cpp/error-messages/compiler...

So by definition, if it's reported as an error, you're compiling with warnings-as-errors enabled, at least with respect to that one warning.

The default project settings you refer to are Visual Studio defaults (for new projects created from a template), not VC++ defaults. If you invoke cl.exe directly from command line on your code, you'll see C4700 being reported as a warning, but it still produces the binary.

/WX is a switch to treat all warnings as errors, but it's not the only switch that controls warnings-as-errors behavior - you have all the other /W... switches, from level-based approach to setting it for specific warnings. And /sdl is (in addition to all the other things it does) an alias for a bunch of those switches. Indeed, on the very page you linked to, it literally says: "/sdl enables these warnings as errors ... C4700". And you can even override that by doing something like /sdl /wd4700.

This is a really nice way to explain the additional level of safety you get over C++.
Rust implements safe and efficient programming by imitating bank lending practices. This is the first time a programming language has borrowed a financial model at the language level, but unfortunately the Rust language designers are not well versed in finance and have made mistakes on core fundamental issues. The relationship between borrowing and lending is not a buying and selling relationship: it arises from transferring the right to use a resource, not from transferring ownership. This leads to a series of semantic errors, and because Rust cannot fully and correctly draw on knowledge of the financial system, programming in it becomes too complicated.

The financial system is the safest, most stable, most rigorous, largest and most tried-and-tested system in human history. From ancient times to today, everyone has taken part in it: men and women, old and young, wise and foolish, good and evil. Rust picked up only a superficial understanding and got some of it wrong, yet it still achieved great success.

Therefore, I suggest that computer science should classify financial knowledge as a compulsory course, which has great reference significance for building a safe, efficient and stable system.

from: https://github.com/linpengcheng/PurefunctionPipelineDataflow...

The use of terms like "ownership", "borrowing" etc with respect to object lifetime management long predate Rust. You may not like the way they are defined in this field, but it's a well-established definition. And it is not intended to have much to do with "bank lending practices", other than a very rough analogy that's easier to explain.
Somebody should tell all the people made homeless by the banking crash of 2008 how uncrashable the banking system is.
Using unsafe techniques in order to gain more benefit carries a risk cost. That is true in both the financial and IT sectors.