| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by Manishearth 3542 days ago

One interesting thing about Rust is that none of the language features are really new. Even the borrow checker is from research papers and languages from quite a while ago. The only thing I can think of that might be truly unique to Rust is the concurrency safety model (Send+Sync), though that might be old too.

Rust has just managed to take all these features and put them together well, and strive to be more than a research language by working on things that would make others actually use the language.

(Of course, this particular feature is common in many, many languages)

7 comments

dannyobrien 3542 days ago

Yes, worth noting that this was an initial goal of Rust: to use research that was rarely implemented, but established and non-novel among CS academics. See http://tim.dreamwidth.org/1784423.html -- I remember some comment about this being part of the choice of the name, too, but I can't find anything specific or definitive about that.

Manishearth 3542 days ago

The name "Rust" has so many origin stories; IIRC the creator kept changing the story each time he was asked :)

pcwalton 3542 days ago

> Even the borrow checker is from research papers and languages from quite a while ago.

Eh, that's underselling Rust's contributions. Rust is more flexible than anything I know of when it comes to enforcing aliasing-xor-mutability. Cyclone for example was much more restrictive in disallowing aliasing (see Grossman's "Existential Types for Imperative Languages").

The key feature that Rust has is flow-sensitive permissions on unique loan paths, which is actually pretty novel as far as I'm aware.

Manishearth 3542 days ago

I guess, yeah. I meant that what Cyclone does is along the same lines.

> The key feature that Rust has is flow-sensitive permissions on unique loan paths, which is actually pretty novel as far as I'm aware.

Huh, right.

catnaroek 3542 days ago

Indeed. Regions are 20 years old. Substructural types are 30 years old. Type inference is 40 years old. Hierarchical module systems fancier than Rust's are at least 30 years old. With the exception of regions, all this stuff is older than I am, myself. But Rust took all these isolated good ideas and made a practical programming language out of it.

junke 3542 days ago

Ada and Pascal are not really research languages, though. But still, there are improvements taken from research, see Ada 2012 contracts for example.

Manishearth 3542 days ago

Right, like I said, this particular feature exists in a million languages. I'm talking about the whole set of features that Rust has; some are from research languages and may have never been seen in the industry before (e.g. regions/borrow checking), but they're not really new. This particular feature in Rust descended from the ML family (since Rust used to be ML-like), which in turn probably got it from Ada or w/e.

nv-vn 3542 days ago

ML predates Ada by about 7 years, so that's unlikely. The way it is implemented in ML is most likely just an implementation of the mathematical concept of sum types rather than a feature influenced by existing programming languages.

Manishearth 3542 days ago

> ML predates Ada by about 7 years, so that's unlikely.

Ha! I didn't know that. Thanks :)

optimuspaul 3542 days ago

sorry if this is a dumb question, but what classifies a language as a research language?

bluejekyll 3542 days ago

That's a great question!

I would probably restate it though. What is research vs. a product. I just read about 1nm transistors, but we're probably looking at 10 years before Intel, et al, have built all the infrastructure to reliably deliver a CPU based on it.

In the case of a language, I would say it's similar, do you have the support infrastructure in place? Rust is amazing in this regard: Cargo, crates.io, docs.rs, rustup, etc. on top of that you have at least one large company and many others pushing the language in a large and distributed product.

I would classify Rust as a production ready coding platform.

optimuspaul 3542 days ago

Thanks!

It seems very obvious now. I was thinking the purpose of the language was for performing research, but that didn't make much sense. I see now that it's the language itself that is the subject of research, not the tool. English is fun.

grayrest 3542 days ago

There are lots of languages in the world. The majority of them are written by a single person for themselves or (if there's funding involved) a small group of people who deeply understand the compiler. These are either research, hobbyist, or company languages depending on the context of the author. They exist to explore a particular idea or solve a specific problem and don't aspire to be widely used.

If you hear the term used disparagingly it's because they tend to have issues you generally wouldn't want to put up with when you're on the clock: compiler bugs, lack of error messages, large missing pieces in the standard lib, spectacularly bad performance, etc because those weren't the problems the language was meant to explore.

vvanders 3542 days ago

Yup, a lot of people like to say X has Y, however execution of Y is just as important(if not moreso) than the initial idea of Y. Technical ideas don't thrive in isolation, they need to be nurtured and grown.

Just look at the dominance of Javascript in the programming landscape, you're seeing decades of high quality executions of the language(I include community in the "execution" definition) despite it being far from the best technical language.

bandrami 3542 days ago

the concurrency safety model (Send+Sync), though that might be old too

Smalltalk has had that since the early 1980s

dbaupp 3542 days ago

Like steveklabnik, I'm extremely curious to hear more about how Smalltalk's concurrency model is just like Rust's. The latter is a fairly flexible model to defend against data races (which seems to be a concept only really formalised in the late 80s) that puts most of the power in the programmer's hands (i.e. no need for compiler-inserted locks on every object, etc.) that comes from a finely balanced combination of the trait system and the manner in which Rust controls mutation.

It would be great to see how other languages have achieved a similar balance, and my impression is that Smalltalk (and like most languages) does not put nearly as much effort into controlling mutability, but maybe I'm wrong. Could you post some links/a description that explores how Smalltalk achieves a similar level of safety?

nickpsecurity 3542 days ago

"which seems to be a concept only really formalised in the late 80s"

http://brinch-hansen.net/papers/1975a.pdf

http://brinch-hansen.net/papers/

His first, concurrent OS was RC 4000 in 1969. It had many mechanisms in place. He got the key parts of the safety problem figured out by 1972. His language to handle much of it statically at compile-time was done in 1975. His Boss 2 system same year used coroutines + similar concepts to run 100 activities at a time with a proof of deadlock freedom. Finally, he used Concurrent Pascal to implement Solo system which ran its processes safely without using physical, memory protection. These summaries came from "Evolution of Operating Systems" on 2nd link.

So, the stuff was well-established by the mid-70's with operating systems using it in production. Just ignored by mainstream like a lot of good stuff for various reasons. ;)

dbaupp 3542 days ago

I don't have time to read it in full right now, but a quick glance over that paper doesn't show any formalisation of data races, which is specifically what that parenthetical is about, not just general "concurrency safety". Having a detailed description of concept seems important because Rust is fairly precise in what it defends against, possibly giving the programmer more flexibility and power than in a system that attempts to solve many problems (but of course giving them less assistance when writing something that fits within the rules of the more defensive languages). Of course, it's possible that a language "accidentally" solves that problem without it having been formalised, it just seems less likely.

nickpsecurity 3542 days ago

It's in the disk buffer and monitor examples. Here's a relevant quote:

"A disk buÆer is a data structure shared by two concurrent processes. The details of how such a buÆer is constructed are irrelevant to its users. All the processes need to know is that they can send and receive data through it. If they try to operate on the buÆer in any other way it is probably either a programming mistake or an example of tricky programming. In both cases, one would like a compiler to detect such misuse of a shared data structure."

"To make this possible, we must introduce a language construct that will enable a programmer to tell a compiler how a shared data structure can be used by processes. This kind of system component is called a monitor. A monitor can synchronize concurrent processes and transmit data between them. It can also control the order in which competing processes use shared, physical resources."

Sounds like he understands the problem is concurrent processes stepping on each others toes using a shared, data structure. He has nice drawings and charts to go with it. Plus, an early model to solve it statically at compile time. Jweb_Guru noted some limitations of his method but it understands and solves the fundamental problem. Hansen's own work improved on it later plus it was obsoleted by things like Ravenscar, SCOOP and now Rust's method. Nice that he had a whole OS protected from concurrency errors at compile time in the 70's, though. :)

dbaupp 3542 days ago

None of that looks like a formalisation or even a description of a data race, at least not in the modern sense.

gpderetta 3541 days ago

> Just ignored by mainstream like a lot of good stuff for various reasons. ;)

This paper introduced Monitors. It was of course hugely influential.

Mutex and condition variables, the building blocks of monitors made their way into posix threads and from there into C++11.

I believe Java is a much closer realization of Hansen model as every java object is a Monitor.

nickpsecurity 3540 days ago

Good point. I should qualify that statement better. Hansen's model was statically guaranteeing freedom from issues at compile time. There were projects aiming at that which could've implemented a similar model but didn't. More work could've gone into eliminating its limitations, etc. The Ada and Eiffel people carried the torch, though, with interesting progress. Now Rust is. Ada folks aren't resting, though, as ParaSail is pretty neat.

Jweb_Guru 3542 days ago

This system does not support recursion. It also does not have to reason about atomics and so on because it is not intended for multicore systems; indeed, it assumes that all access is serialized through a mutex, and doesn't consider the existence of types like single-threaded reference counters (like Rust's Rc). It is describing threads with exclusive access, and a scheduling mechanism for waiting on locks, but the mechanism is entirely dynamic (and the paper assumes the existence of a virtual machine, which is frequently invoked to deal with tricky cases).

Some relevant lines:

> A Concurrent Pascal compiler will check that the private data of a process only are accessed by that process. It will also check that the data structure of a class or monitor only is accessed by its procedures

is significantly less than what Rust provides, because:

> Processes cannot operate directly on shared data. They can only call monitor procedures that have access to shared data. A monitor procedure is executed as part of a calling process (just like any other procedure).

> If concurrent processes simultaneously call monitor procedures that operate on the same shared data these procedures will be executed strictly one at a time. Otherwise, the results of monitor calls would be unpredictable. This means that the machine must be able to delay processes for short periods of time until it is their turn to execute monitor procedures. We will not be concerned with how this is done, but will just notice that a monitor procedure has exclusive access to shared data while it is being executed.

Rust allows shared access to immutable data and does not require either serialization or the invocation of a virtual machine; it also allows nondeterministic operation to be used as long as it cannot cause memory unsafety.

Additionally, there are other, more basic things that the system is not capable of. For instance:

> (Strictly speaking, a compiler can only check that single monitor calls are made correctly; it cannot check sequences of monitor calls, for example whether a resource is always reserved before it is released. So one can only hope for compile time assurance of partial correctness.)

A major part of what Rust brings to the table is the ability to statically avoid problems like use after free and double free even in a concurrent setting. How would I avoid resource leaks in a system like this without a per-thread allocator (which would likely prevent sending across threads)?

Of course, it's not clear to me that this matters anyway, since I can't destroy a thread or shared data structure!

> Dynamic process deletion will certainly complicate the semantics and implementation of a programming language considerably. And since it appears to be unnecessary for a large class of real-time applications, it seems wise to exclude it altogether. So an operating system written in Concurrent Pascal will consist of a fixed number of processes, monitors, and classes. These components and their data structures will exist forever after system initialization.

I agree that people have a strong tendency to not know what past work has been done, and Concurrent Pascal is admirable, but what Rust does is not the same thing. Rust specifically tackles a lot of hard problems in concurrency that simply do not exist as long as you (1) assume away deallocation for shared data structures, and (2) serialize all access to those data structures.

(There are lots of other specific points I could go into; for instance, Concurrent Pascal apparently lacks facilities for generic programming, so I probably couldn't combine two correct concurrent data structures and expect a third correct concurrent data structure to come out. But that stuff isn't directly related to concurrency).

nickpsecurity 3542 days ago

"I agree that people have a strong tendency to not know what past work has been done, and Concurrent Pascal is admirable, but what Rust does is not the same thing. "

I'm not saying it does. The claim I replied to said preventing things like data races was done till well into the 80's. I showed systematic investigations of the problem plus a production solution had been done by mid-70's. You've nicely showed how far ahead methods like Rust got. ;)

dbaupp 3542 days ago

> The claim I replied to said preventing things like data races was done till well into the 80's

No, it definitely didn't: it said data races as a concept seemed to not be formalised until the late 80's.

stcredzero 3542 days ago

my impression is that Smalltalk (and like most languages) does not put nearly as much effort into controlling mutability

VisualWorks got the ability to make certain objects immutable sometime in the early 2000's. The VM engineers were particularly proud of that one.

stcredzero 3542 days ago

Smalltalk has had that since the early 1980s

That really depends on which Smalltalk you're talking about, I think. Various implementations really had different concurrency models. Some implementations used native OS threads. Others ran synchronously "inside the image" but spawned threads to interact with the OS.

gpderetta 3542 days ago

that's interesting, AFAIK smaltalk is fully dynamic and has no compile-time type checking, so how would it statically enforce the equivalent of Send and Sync constraints?

stcredzero 3542 days ago

In most implementations, Smalltalk is compiled to bytecode, and uses late binding. It's usually run on a JIT.

how would it statically enforce the equivalent of Send and Sync constraints?

In some Smalltalks, normal execution is synchronous. Many of them also use read and write boundaries for various purposes. The former gives you Send and Sync constraints for free. The latter can be used in difficult edge cases. (Like when you're calling out to the OS.)

Manishearth 3541 days ago

> The former gives you Send and Sync constraints for free.

Uh, no, it gives you thread safety for free.

When my comment was talking about Send and Sync I was talking about the specific way Rust's typesystem enforces thread safety. I'm not saying that other languages don't, I'm saying that Rust's method of enforcing it might be one of the few unique things it does

pcwalton 3542 days ago

Smalltalk is dynamically typed. How can it have a static type system feature?

Manishearth 3542 days ago

(Yeah, just to clarify, in my comment above, I'm talking about systems similar to Rust's specific system for enforcing thread safety; not arbitrary systems that enforce thread safety in a different way)

steveklabnik 3542 days ago

I'd like to read more about this, do you have a recommendation of where I can look? I have a basic fluency in Smalltalk, but it's been a while.

Manishearth 3542 days ago

Interesting, thanks!

rurban 3542 days ago

Putting them well together in Rust is debatable.

Writing proper and safe concurrent code in Rust still looks horrible compared to languages which supports it natively in the type system, e.g. ponylang type capabilities. You still have to manually maintain locks, and it's also much slower.

There's a safety model, but it's only best practice, not enforced by the language nor the compiler. So calling it "safe" and "truly unique" is way off. Even parrot has a better, safe and lockless threading model, which guarantees safety.

Manishearth 3542 days ago

> You still have to manually maintain locks, and it's also much slower.

You ... don't. That's just the concurrency model that the abstractions in the stdlib expose, but Rust's safety is generic enough that you can use different abstractions (e.g. lockfree ones). See crossbeam for some alternative models. There are also transactional memeory impls in Rust. They still use Send and Sync for safety though. Manually maintaining locks is a feature of the concurrency library used, not the safety model.

Pony's system is actually pretty close to that of Rust. Sync is "immutable", Send is "isolated" (sort of). Of course, capabilities are different from auto traits, but the idea behind using these two capabilities for concurrency safety is similar.

(So I was wrong that Rust's concurrency safety system is unique, since Pony has something based on the same concepts)

> There's a safety model, but it's only best practice, not enforced by the language nor the compiler.

Yeah, Send and Sync are technically a part of the stdlib (and can be reimplemented outside of it, aside from a small interaction with statics -- Send/Sync are treated as special by the compiler when it comes to statics). However, it is enforced by the compiler in the sense that if you avoid best practice (using Send and Sync), you can't write parallel code without dropping to `unsafe`. If you do that you can design safe abstractions around that and use whatever concurrency enforcement you wish, though pretty much everyone sticks to Send and Sync since it works well with the rest of the language.

(Because of the statics thing, "not part of the language" is debatable, anyway)

gpderetta 3542 days ago

Having a language-imposed concurrency model would be useless and actively harmful in a system language. Rust provides the building blocks to build whatever safe abstractions are appropriate for the problem and domain at hand, plus a set of out of the box abstractions relatively low level that will be familiar to people coming from other system languages.

And of course the means to get rid of any abstraction and safety when required.

edit: accidentally a word

jonathanstrange 3542 days ago

> Having a language-imposed concurrency model would be useless and actively harmful in a system language.

It works fine in Ada.

Edit: Why am I downvoted for this? A language-defined concurrency model is indispensable for having safe and deadlock-free concurrency in the language (rather than in arbitrary utility libraries). That's why concurrency was explicitly included into Ada and into the additional Ravenscar profile. The rationale for this has always convinced me and it also works fine in Ada, so anybody care to elaborate what's wrong with it?

nickpsecurity 3542 days ago

It worked great for Concurrent Pascal and Solo:

http://brinch-hansen.net/papers/

I agree it's better to have mechanisms that can be turned into a proper model. A good default, though, greatly improves consistency and reliability in real-world systems.

rwmj 3542 days ago

Only if development of the language and OS are separate. If you co-develop the language with the OS, then it makes perfect sense to push safety features into the language (or conversely, to remove them from the language when they are no longer appropriate).

LISP-strength macros give you most of this capability.

gpderetta 3542 days ago

OS are not the only use cases for system languages and developing a language to be tightly tied to an os (and viceversa) is a great way to condemn both to obscurity.

amalcon 3542 days ago

Erm... I think Dennis Ritchie would have had a few things to say about that.

gpderetta 3542 days ago

good point; though, while designed to work well with each other, neither unix nor C require the other.

pcwalton 3542 days ago

> There's a safety model, but it's only best practice, not enforced by the language nor the compiler.

What? How is it not enforced?

> You still have to manually maintain locks, and it's also much slower.

Slower? I don't think it can be any faster even in theory!