Hacker News
by benreesman 1423 days ago
I find it so weird that the Rust community is borderline evangelical about memory safety when a) it's not actually memory safe once you start doing heavy shit b) modern C++ is quite memory safe and c) there are so many other great reasons to like Rust.

Memory safety in serious systems software is something that you approach asymptotically and/or probabilistically. Rust makes it easier to be memory safe in a lot of scenarios, at the cost of the father-knows-best borrow checker, but a crashed program is a crashed program whether I dereferenced a null pointer or was poking around in a slice with multi-byte Unicode characters in it. And that's before you get to `rg unsafe` on your favorite industrial-strength Rust codebase.

Rust is cool for so many great reasons that get talked about so little because everyone seems too busy acting superior about memory safety. Talk about traits! Or Cargo! Or the cool async stuff! Anything but another lecture on memory safety.

6 comments

> a crashed program is a crashed program whether I dereferenced a null pointer or was poking around in a slice with multi-byte Unicode characters in it

They aren't the same thing though, that's the point.

Dereferencing a NULL pointer isn't guaranteed to crash. In fact if you are writing through the pointer, you may even have a security issue on your hands (rce, etc).

Safe Rust may have runtime errors that "crash" the program but this is a controlled, well-defined termination, and there is no way for the execution state itself to be corrupted like in C++.
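
To make the "controlled, well-defined termination" point concrete, here is a minimal sketch of the multi-byte slice case from the quoted comment. The function names are made up for illustration. Slicing a `&str` in the middle of a UTF-8 sequence panics and unwinds in a defined way, and `str::get` offers the same check without the panic.

```rust
use std::panic;

/// Returns the first `n` bytes of `s` as a string slice, or `None` if `n`
/// is out of range or falls inside a multi-byte UTF-8 sequence.
fn prefix_checked(s: &str, n: usize) -> Option<&str> {
    s.get(..n)
}

/// Runs the panicking form `&s[..n]` and reports whether it panicked.
/// The panic unwinds the stack in a defined way; no memory is corrupted
/// and the caller can observe the failure.
fn prefix_panics(s: &str, n: usize) -> bool {
    panic::catch_unwind(|| {
        let _ = &s[..n];
    })
    .is_err()
}
```

With `"héllo"`, byte index 2 lands inside the two-byte `é`, so the checked form returns `None` and the indexing form panics, while index 3 is a valid boundary.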

Yep, my favorite part of heap/stack corruption is not when it crashes immediately but rather when it rears its head 2-3 weeks/months later when some upstream call pattern or timing has changed.

I've spent weeks chasing down single instances of this on multiple projects. The nasty part is that you can't predict whether it's going to be one that crashes immediately, silently writes garbage (hopefully not to disk!), or is a latent lurking crash or security vuln.

If you trash the stack then there's a good chance you lose the backtrace as well which can make a hard to debug issue become "find the needle in the haystack". I hope it's something that reproduces quickly and consistently because otherwise you're in for a ride.

Yeah, this just isn't how it is anymore. The last time I was up shit creek because 50k boxes were crash looping and GDB couldn't get me a stack trace was in 2014. The last time I spent more than 30 minutes chasing a memory corruption issue was in like, 2018. And it was because some wise ass had decided to roll his own fibers by stomping on `rip`, `rbp`, and `rsp`.

These days you use `std::unique_ptr`, build with clang-tidy, CI under ASAN, and it's never an issue in practice. Once in a blue moon the CI chirps an ASAN failure that gives you the entire history of the memory address with line numbers and you fix the typo.

The safety that Rust gives me is that its more expressive type system and modern affordances for things like exhaustive pattern matching let me avoid logic errors, which are every bit as deadly as buffer overruns and much harder to mechanically identify.
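
As a small sketch of what exhaustive pattern matching buys you: the `ConnState` enum and `can_send` function below are hypothetical, invented for this illustration. Because the `match` has no catch-all arm, adding a variant later makes the function fail to compile until the new case is handled.

```rust
/// Hypothetical connection states, invented for this illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ConnState {
    Idle,
    Connecting { attempt: u32 },
    Ready,
    Closed,
}

/// Decides whether a request may be sent in the given state.
/// There is deliberately no `_ =>` arm: if a new variant is added to
/// `ConnState`, rustc rejects this function until the case is covered.
fn can_send(state: ConnState) -> bool {
    match state {
        ConnState::Ready => true,
        ConnState::Idle => false,
        ConnState::Connecting { .. } => false,
        ConnState::Closed => false,
    }
}
```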

It is usually easier to write correct code in Rust than in C++ because it's much more modern and frankly kind of an everyman's Haskell (which I mean as a compliment). But it's intellectually dishonest to say that this doesn't come at a cost: when you wander out of the borrow checker's sweet spot it can become kind of a Tetris puzzle even when you know all the rules on paper.

The same pattern matching that lets people see a borrow checker puzzle and immediately say "right, we need to do X" is the pattern matching that lets a C++ hacker see a failed template instantiation and immediately know what got misspelled.

A sibling comment suggests this has more to do with where you work than how modern your C++ is, which rings true to me. Different kinds of programs need different kinds of memory management patterns, and some are more error-prone than others.

In my experience there also tends to be a long tail of memory corruption bugs. After flushing out those that are easy to run into or that have a major impact, everything seems fine and you can go years without really spending time on them, but they're still lurking at the edges of automated crash reports and mysterious bug reports you never quite manage to reproduce yourself. And when I do manage to track one down, it's as likely as not to be in, around, or even caused by modern C++ features.

Tetris puzzle or not, it's really quite nice to systematically rule out those kinds of issues. In some domains it may not be worth it, but in others they can hide major security issues or similar. And either way it sure beats periodically digging through crash dumps trying to piece together something that looks impossible from the surrounding source code.

If you need extreme robustness you have to have coverage and fuzzing and canaries and stuff for logic bugs as well as memory bugs. If you’ve got a long tail of non-exercised code paths, a “<” flipped with a “>” will fuck up your day just as bad as a use-after-free.

If your code is covered, ASAN will red-zone the memory bug. It checks every address.

People are welcome to their subjective opinions about the “easiest” way to get truly correct software (which almost no one needs), but the oft-repeated assertion/implication that the tools don’t exist to do it outside of Rust/Go is wrong. Not a subjective opinion, demonstrably incorrect.

And when enough truly important shit is written in Rust, which will be soon, there will be CVEs. Many of them.

Well, yeah, if you're reaching for that level of robustness you want every tool you can get. If you can get rid of a whole category of bugs with one tool, that only makes the other tools more effective for the rest!

(There are also cases where that extra robustness is more of a "nice to have," so if you can get a side effect of your approach to something more important, that changes the calculus too.)

> a “<” flipped with a “>” will fuck up your day just as bad as a use-after-free

One is categorically worse than the other though.

I deal with issues like this about once a month. It may not happen where you work, but it definitely still happens.

If it really never happens where you work, consider yourself lucky.

Is every diff thoroughly reviewed? Is everything built with `-Wall -Wpedantic -Werror`, `clang-tidy` with most checks on, ASAN/TSAN/MSAN/UBSAN on every commit in the CI, and aggressively canaried against replay data (or whatever is appropriate to the domain to exercise all the paths)? Is all the code run through `clang-format` in a pre-commit hook to lower the cognitive overhead of spotting bugs?

I completely understand that when you turn all the checks up to maximum (which, in fairness, `rustc` does by default) you start with as many errors as you have files if you're lucky, and probably 10x that. I've had to take codebases from working by accident on every 10th line to passing all the static analysis cheaper than PVS-Studio, and it's a bear no doubt. But codebases that are `clang -Werror` clean, `clang-tidy` and `cppcheck` clean, ASAN/MSAN/UBSAN clean, and have all this enforced by CI?

I haven't seen those codebases thrash the core dump where GDB prints out a bunch of "????????????" instead of addresses with any frequency.

Someone should do a 2022-edition "Joel Test" (https://www.joelonsoftware.com/2000/08/09/the-joel-test-12-s...) because I think we're all using revision control now, times change, but until someone does, I'm happy to trade war stories about getting messy code bases / development workflows into fighting form.

We still develop on and support platforms where we use the vendor compilers (which don't have many of those modern features).
It has been years since I spent any time chasing down memory usage errors.

I recommend compiling with warnings turned on, and acting on them.

> a) it's not actually memory safe once you start doing heavy shit b) modern C++ is quite memory safe

This just doesn't capture the problem that memory safety solves. A crashed program is not the worst-case scenario that it's trying to avoid. Even the most memory-safe language supports exiting early with an error message, or whatever.

In terms of language semantics, there is an all-or-nothing line between memory safety and undefined behavior. A memory safe program does what it says, locally, step-by-step, according to the semantics of the language. When a program exhibits UB, those guarantees are lost.

Of course, as you note, unsafe Rust also lets you violate memory safety, and in fact any memory safe language is at the mercy of its implementation and host. The reason people get evangelical about Rust's memory safety is one level higher: it offers a bridge back to memory safety, such that unsafe code stands on the same footing as the core language. When either is bug-free, the compiler can ensure they are used correctly, using the same type system features for both.

Modern C++ is certainly much less error-prone than the bad old days of manual `new` and `delete`, but it doesn't have an answer to this "unsafe encapsulation." To the contrary, modern C++ actually adds a bunch of new ways to violate memory safety by misusing library APIs. Iterator invalidation, use-after-move, string_view and span and borrowed ranges, by-reference lambda and coroutine captures, etc.

This all means that "serious systems software" in C++ has to approach memory safety via defensive copying or refcounting, copious use of sanitizers, and sandboxed sub-processes. Meanwhile, Rust programs can do things that would be unthinkable in a large C++ codebase, because the assumptions of both the language and unsafe code are encoded in the type system. (For example: https://manishearth.github.io/blog/2015/05/03/where-rust-rea...) It's a qualitatively different solution to the problem.
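
Iterator invalidation, one of the hazards listed above, makes a compact illustration. This is a sketch with a made-up function name: the direct translation of the classic C++ bug (pushing to a vector while iterating over it) is shown only in a comment because rustc rejects it, and the accepted version ends the shared borrow before mutating.

```rust
/// Appends a copy of every even element to the end of `v`.
fn duplicate_evens(v: &mut Vec<i32>) {
    // The direct translation of the C++ bug does not compile:
    //
    //     for x in v.iter() {
    //         if x % 2 == 0 {
    //             v.push(*x); // error[E0502]: mutable borrow while `v.iter()`
    //         }               // still holds a shared borrow
    //     }
    //
    // Collect first, so the shared borrow ends before the vector is mutated
    // (a push may reallocate and would leave the iterator dangling).
    let evens: Vec<i32> = v.iter().copied().filter(|x| x % 2 == 0).collect();
    v.extend(evens);
}
```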

I think memory safety is the killer feature of rust, and has become so because people see the real world problem it's solving, more than through evangelicalism. We'll see in a few years when more "heavy shit" has been written/rewritten in rust. My prediction is that they will have significantly fewer memory safety issues than comparable c++ "heavy shit".

Business also needs to care:

> Many years later we asked our customers whether they wished us to provide an option to switch off these checks in the interests of efficiency on production runs. Unanimously, they urged us not to--they already knew how frequently subscript errors occur on production runs where failure to detect them could be disastrous. I note with fear and horror that even in 1980, language designers and users have not learned this lesson. In any respectable branch of engineering, failure to observe such elementary precautions would have long been against the law.

-- C.A.R. Hoare, in his 1980 Turing Award lecture, published in 1981.

From where I sit the killer feature of Rust is that a bunch of amazingly cool software is written in it, especially in the terminal. I'm a big terminal guy, and I can't think off the top of my head of anything I use constantly that isn't written in Rust. `rg`, `fzf`, `zoxide`, `bat`, `viddy`, the list goes on and on, I fucking love the shit people are writing in Rust.

And I think that should be the killer feature of a language: that cool software is written in it and is continuing to be written in it. This is a killer feature shared by Rust and C++ and these days to be serious about performant software in diverse settings, you pretty much have to know both well.

Terminal programs are one area where Rust's strengths seem to align (very good CLI libraries/parsing, error management, and concurrency) and weaknesses are less relevant (async, GUIs), which might be why it seems to be gaining traction in that area.

I'm embarrassed enough about misattributing like 2/5 to Rust that are written in golang, so I just dumped my minimum (non-work) `home.nix` package list: https://gist.github.com/johnnystackone/bacd9275296f3d5d0cd75....

I'm not going to go through every one and remember/look up which ones are written in Rust, but I'll wager that half-ish of them are.

Thanks for that list. I'd heard of rg and fzf but not the others.

I immediately thought: well what about Go for command line tools? Is this the viddy you speak of? https://github.com/sachaos/viddy If so, looks like it is written in Go. Looks like fzf too.

btw, fzf is written in go ^^

> We'll see in a few years when more "heavy shit" has been written/rewritten in rust.

I'd really like to see that. Would be cool to see a completely new Linux user space written in Rust. Not necessarily a rewrite of existing software, new ideas would be great. I tested Linux system calls and they worked very well even though they needed experimental inline assembly functionality to work. With system call support, anything is possible.
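
For anyone curious what a raw Linux system call from Rust looks like: the `asm!` macro has since been stabilized (Rust 1.59), so nightly is no longer needed. The sketch below is an assumption-laden illustration, not the parent's code. It only takes the inline-assembly path on x86_64 Linux (where `getpid` is syscall number 39) and falls back to the standard library elsewhere. `raw_getpid` is a made-up name.

```rust
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
use std::arch::asm;

/// Calls `getpid` through the raw `syscall` instruction (x86_64 Linux only).
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
fn raw_getpid() -> i64 {
    const SYS_GETPID: i64 = 39; // syscall number on x86_64 Linux
    let ret: i64;
    // SAFETY: getpid takes no arguments and touches no user memory. The
    // `syscall` instruction clobbers rcx and r11, which are declared below.
    unsafe {
        asm!(
            "syscall",
            inlateout("rax") SYS_GETPID => ret,
            out("rcx") _,
            out("r11") _,
            options(nostack)
        );
    }
    ret
}

/// Fallback for other targets so the example still builds there.
#[cfg(not(all(target_os = "linux", target_arch = "x86_64")))]
fn raw_getpid() -> i64 {
    i64::from(std::process::id())
}
```

Either path should agree with `std::process::id()`.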

Once upon a time there was a project to do a Linux distribution in Ada.

Unfortunately it died a couple of years later.

Slowly there are more and more implementations emerging of Linux userland utils in Rust.

It's taking a while.

>a crashed program is a crashed program whether I dereferenced a null pointer or was poking around in a slice with multi-byte Unicode characters in it

Most of the biggest advances in software engineering are because of increased modularity. One of the best traditional ways to increase modularity is the ability to define and call functions. But any isolation between these "function" modules is only possible if you can at least factor out things into a function mechanically without introducing crashes (for example because of memory unsafety--modularity would fly out of the window right there).

>Rust is cool for so many great reasons that get talked about so little because everyone seems too busy acting superior about memory safety. Talk about traits! Or Cargo! Or the cool async stuff! Anything but another lecture on memory safety.

It's better not to dilute the message. All these other things are nice-to-have gimmicks. But the memory safety is a game-changer. It does no good to advertise 230 features at the same time. No one will remember. Advertise the killer feature. And that's the lifetime stuff, which gives you memory AND THREAD safety.
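
A minimal sketch of the "memory AND THREAD safety" claim, using a made-up `parallel_sum` function: scoped threads may borrow from the caller's stack because the lifetime rules prove the borrows cannot outlive the scope, and the shared total has to sit behind a `Mutex` (or similar) or the program does not compile.

```rust
use std::sync::Mutex;
use std::thread;

/// Sums every chunk on its own thread and returns the grand total.
fn parallel_sum(chunks: &[Vec<u64>]) -> u64 {
    let total = Mutex::new(0u64);
    thread::scope(|s| {
        for chunk in chunks {
            // Borrow the mutex so the `move` closure captures a reference
            // instead of trying to take ownership of the mutex itself.
            let total = &total;
            s.spawn(move || {
                let partial: u64 = chunk.iter().sum();
                *total.lock().unwrap() += partial;
            });
        }
        // All spawned threads are joined when the scope ends.
    });
    total.into_inner().unwrap()
}
```

Replacing the `Mutex<u64>` with a plain `u64` mutated from each thread is rejected at compile time, which is the data-race guarantee being advertised.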

Rust's thread safety only applies to the special case of those threads accessing in process data segments.

Rust's type system can do very little to help when those threads are accessing the same record on a database without transactions, OS IPC on shared memory, manipulating files without locks, handling hardware signals, handling duplicate RPC calls,...

"Yeah, but that ultimately requires an unsafe block" is kind of true, except that no one reads the code of every crate they depend on, and the direct dependencies may look safe as far as their direct consumers are concerned.

This is a point that you constantly bring up in these threads. Do you think most developers believe that data race safety should extend beyond the bounds of the process?

One thing that Rust’s type system does allow you to do is define a consistent manner in which to access external systems, even add types that will mimic the same safety. Is it perfect? Will it protect you from a different process working against the DB? Will it enforce things in the other process? No. But will it give you higher level semantics to be able to construct a better model for operating against that external system? Yes.
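
One way to read "higher level semantics" here is optimistic versioning encoded in the API. The sketch below is purely hypothetical: `Store`, `Versioned`, and `WriteError` are invented names, and the `HashMap` stands in for a real external system. The only write path consumes the value that was read, so a stale write is detected instead of silently winning. It does nothing to constrain other processes that bypass this API.

```rust
use std::collections::HashMap;

/// A value read from the (hypothetical) store, tagged with its version.
#[derive(Debug, Clone, PartialEq)]
struct Versioned<T> {
    value: T,
    version: u64,
}

#[derive(Debug, PartialEq)]
enum WriteError {
    Missing,
    Stale { current: u64 },
}

/// In-memory stand-in for an external store.
struct Store {
    rows: HashMap<String, Versioned<String>>,
}

impl Store {
    fn new() -> Self {
        Store { rows: HashMap::new() }
    }

    fn insert(&mut self, key: &str, value: &str) {
        let row = Versioned { value: value.to_string(), version: 0 };
        self.rows.insert(key.to_string(), row);
    }

    fn read(&self, key: &str) -> Option<Versioned<String>> {
        self.rows.get(key).cloned()
    }

    /// Writes only if the row is still at the version the caller read.
    /// Taking `read` by value means each read permits at most one write.
    fn write(
        &mut self,
        key: &str,
        read: Versioned<String>,
        new_value: &str,
    ) -> Result<u64, WriteError> {
        let row = self.rows.get_mut(key).ok_or(WriteError::Missing)?;
        if row.version != read.version {
            return Err(WriteError::Stale { current: row.version });
        }
        row.value = new_value.to_string();
        row.version += 1;
        Ok(row.version)
    }
}
```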

Yes, when they care about data consistency in distributed systems.

Maybe many Rust devs don't care.

So what is your point? You’re changing the goal posts on safety and it’s a pointless reductive argument.

Even if we account for everything, solar storms will eventually flip bits unexpectedly. Does that make Rust’s guarantees worthless?

The goal posts stay in the same place. There are ways towards data corruption where Rust's fearsome concurrency is of no help.

This has been a recurring theme from you, but in the cases you're describing the risk is only a race condition (no general solution is possible) and not a data race (which safe Rust is able to deny by design). These are categorically different problems.

Because it is a recurring theme to ignore the other kind of race scenarios when promoting Rust's type system.

People have to consider race conditions anyway, they're part of our world. For example if you use git's ordinary --force to overwrite certain changes that's subject to a TOCTOU race, which is why force-with-lease exists. Even in the real world, I once opened a bedroom window to throw a spider out onto the garden below and a different spider came in through the window in the brief interval while it was open - exploiting the "open window to throw out spider" race opportunity.

A data race isn't just "Oh it's just a race condition", or Rust wouldn't care: data races destroy sequential consistency. And humans need sequential consistency to reason about the behaviour of non-trivial programs. So, without it, writing concurrent software is a nightmare. Hence, "fearless concurrency".
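
The data race versus race condition distinction can be sketched in a few lines. The names below (`book_racy`, `book_atomic`, `count_winners`) and the seat-booking framing are made up for illustration. Both functions are safe Rust with no data race, but the first is a check-then-act (TOCTOU) race condition that the type system does not catch, while the second performs the check and the update as a single atomic step.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Check-then-act: compiles and is free of data races, but two threads can
/// both observe `false` before either stores `true`.
fn book_racy(seat: &AtomicBool) -> bool {
    if !seat.load(Ordering::SeqCst) {
        seat.store(true, Ordering::SeqCst);
        true
    } else {
        false
    }
}

/// The check and the update are one atomic operation, so exactly one
/// caller can win.
fn book_atomic(seat: &AtomicBool) -> bool {
    seat.compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst)
        .is_ok()
}

/// Spawns `n` threads that all try to book the same seat and returns how
/// many of them believe they succeeded.
fn count_winners(n: usize, book: fn(&AtomicBool) -> bool) -> usize {
    let seat = Arc::new(AtomicBool::new(false));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let seat = Arc::clone(&seat);
            thread::spawn(move || book(&seat))
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|won| *won)
        .count()
}
```

`count_winners(8, book_atomic)` is always 1. `count_winners(8, book_racy)` is at least 1 and can occasionally be more, depending on scheduling, which is the kind of bug Rust's guarantees do not rule out.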

You won't destroy sequential consistency by having non-transactional SQL queries. Try it.

I have tried plenty of times, and seen not so happy train travelers with the same ticket for the same place on the same train, hence why I bring it up all the time.

It is obvious it is a subject that is irrelevant in the Rust community.

Who needs consistency in distributed systems when multiple threads from the same process are accessing the same external data without coordination?

Is Rust really "fearless concurrency"?

Considering the number of deadlock issues encountered by folks using async Rust, I think "fearless concurrency" is misinformation by Rust evangelists.

Deadlocks in a production Rust microservice:

https://blog.polybdenum.com/2022/07/24/fixing-the-next-thous...
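
For context, deadlocks are possible in entirely safe Rust, since they are not a memory safety violation. The sketch below is a generic illustration with a made-up function name, not the bug from the linked post. It uses `try_lock` to show that a second lock attempt would have to wait while the first guard is alive. With a blocking `lock()` in its place, the standard library docs leave the behavior unspecified apart from the call not returning (it may deadlock or panic).

```rust
use std::sync::{Mutex, TryLockError};

/// Returns true if a second lock attempt on `m` would have to wait while
/// this function is still holding the first guard.
fn second_lock_would_block(m: &Mutex<i32>) -> bool {
    let _guard = m.lock().unwrap(); // held until the end of the function
    let blocked = matches!(m.try_lock(), Err(TryLockError::WouldBlock));
    blocked
}
```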

Rust isn't a startup business or (one hopes) a religion, or an MLM, or a home for sale. It's a useful tool among many.

Why do people say things like: "It's better not to dilute the message"? Better for who? That's sales/marketing language, not engineering language.

"The message"? Pardon my Francais, but WTF?

>Why do people say things like: "It's better not to dilute the message"?

>Better for who?

Better for everyone.

When talking about a new thing, it would be really silly to emphasize how nice the logo is, how nice the package it comes in is, look at the awesome tape the box is closed with etc. If I turn the product off it even turns off! Look at the nice rounded corners of the device!

It even can do async! Just like Javascript and .NET.

Who cares!

What is the main strength of the tool, the pain point it was made to eliminate? Lead with that.

> That's sales/marketing language, not engineering language.

Leading with the actual technical novelty that actually advances the state of the art in production compilers is marketing? Well, I guess it's good marketing in a way.

The user will find cargo on their own in 5 minutes.

I mean "message" how you want, HN is hosted in a free country.

But N=1 for you: as a serious polyglot user of Rust who knows it well and uses it all the time: this shit is a huge turnoff. It's a programming language. On a long enough timeline all the motivated hackers will end up knowing many programming languages well, they all have pros and cons.

Trying to boil important engineering decisions down to a tweet so that we can stay "on message" comes off like something someone would do if they were selling books or training or consulting services attached to a technology, which a priori gives them an agenda other than giving good advice.

So to keep it short: help people pick the right tool for the job without an agenda.

I think you have a point. Rust is primarily focused on being a systems language, and memory safety is the killer feature it brings to the table in that domain. But we know that Rust is being used in areas where its qualities as a systems language are less important.

Why, for example, would a Python developer pick up Rust? Probably because of the really strict typing addressing a major pain point for most Python developers and the trait system being somewhat analogous to Protocols, which any Python developer who has chafed with the dynamic typing is almost certainly already familiar with. With good library support for interfacing between the two, it's a more natural coupling than most people would think on the face of it.

That said, while I don't think a Python developer reaches for Rust because of memory safety, I do think it's still an important factor as it provides the guard rails that make it so someone who has primarily used a GC language and not had to concern themselves as much with managing memory can start using Rust knowing that the compiler is not going to let them accidentally shoot themselves in the foot when it comes to memory management.

Rust is a programming language. It’s the right tool for the job of programming. It happens to have a lot of features that make it a very good programming language.

I think you’re wrong about the language and community, though. Its killer feature is its safety, be it memory, data race, or type. These are the reasons I was interested in learning the language. The fact that the tools make that easier is why I was able to struggle through the new concepts and actually be able to build useful things with it.

If the fact that people enjoy something as a general community turns you off, that’s not the community’s problem.

Even if rust wasn’t memory safe, being C++ with traits and ADTs would be enough for me to use it. Those are important safety features!

rustc rejects the resulting program if you mechanically factor a function that takes `&mut self` into one function that holds mutable borrows to half the fields and calls another function that accesses the other half (or vice versa: the caller holds `&field` and calls a method that mutates other fields). This requires the more complex transformation of passing individual fields into the subfunction (more work, but sometimes easier to read), or waiting for Rust to add partial borrows. Note I've never actually hit this case myself, though I've heard it's an issue people run into.

The third option is to use a wrapper type with interior mutability, but that is to be done very sparingly.
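
A sketch of the situation described above, with made-up types (`Counter`, `CellCounter`). The rejected refactoring appears only in a comment. The first workaround passes the individual field into the helper, and the second uses `Cell` for interior mutability.

```rust
use std::cell::Cell;

struct Counter {
    names: Vec<String>,
    total_len: usize,
}

impl Counter {
    // Rejected by rustc: the loop holds a shared borrow of `self.names`
    // while `add` needs all of `self` mutably.
    //
    //     fn add(&mut self, n: usize) { self.total_len += n; }
    //     fn tally(&mut self) {
    //         for name in &self.names {
    //             self.add(name.len());
    //         }
    //     }

    /// Workaround 1: the helper receives only the field it mutates.
    fn add_to(total_len: &mut usize, n: usize) {
        *total_len += n;
    }

    fn tally(&mut self) {
        for name in &self.names {
            // Borrows of disjoint fields are accepted within one body.
            Self::add_to(&mut self.total_len, name.len());
        }
    }
}

/// Workaround 2: interior mutability, so helpers only need `&self`.
struct CellCounter {
    names: Vec<String>,
    total_len: Cell<usize>,
}

impl CellCounter {
    fn add(&self, n: usize) {
        self.total_len.set(self.total_len.get() + n);
    }

    fn tally(&self) {
        for name in &self.names {
            self.add(name.len());
        }
    }
}
```

The `Cell` version moves the aliasing question out of the borrow checker's view, which is one reason to use it sparingly.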

Rust is "cool", lately. But it lacks basic features that enable capturing important semantics in libraries.

So the tradeoff is not relative safety against a little compile-time inconvenience. The tradeoff is against "no, you cannot code that thing at all, suck it".

So almost all discussion of relative safety (which Rust advocates would like us to think is absolute) carefully sidesteps the point that there is a very great deal that cannot be expressed in Rust at all -- and not because expressing those things would have been at all unsafe.

i honestly don't think i've seen "cool async stuff" as a substring before.