| HN Mirror



	by mr_00ff00 1093 days ago
	If unsafe means “safe but the compiler cannot verify” then I guess just consider .cpp to mean “safe but the compiler cannot verify” and we have suddenly made C++ memory safe

5 comments

retrac 1093 days ago

There's a related idea in Haskell, usually considered a memory safe language. You can write a program in Haskell that directly mutates memory, or does IO operations, freely, anywhere in the code. This violates functional purity and the compiler cannot offer its usual promises; your program may very well segfault from a bug in such code. But sometimes you just have to, perhaps to implement an algorithm efficiently.

Still, it is discouraged, both culturally in the language community and through the subtle prodding of the language itself (such as everything being typed "IO", or the slightly ominous "unsafe" in "unsafePerformIO"). Very often, the amount of code that must truly live in IO can be reduced to a few dozen lines, if that. That code is crucial to get right -- it's where the actual sequence of computation and external effects is handled. Such isolation allows the rest of the code to not have to worry about those matters.

jjnoakes 1093 days ago

Sure, and if a typical Rust program that I write has no unsafe in it directly, and 5% of its dependencies' code have unsafe in them, that's also the same as writing a program in the "not c++" language directly, and using "not c++" dependencies for all but 5% of the dependency code.

Seems like a silly analogy to me, though.

mr_00ff00 1093 days ago

Right, but it’s that 5% the original comment is talking about: the times when Rust has to use unsafe for the type of program.

jjnoakes 1093 days ago

Unsafe Rust is safer than C++, and even if it wasn't, 5% unsafe in Rust programs (in well-marked locations) is vastly superior to 100% unsafe in C++ programs.

Any analogy that equates the two is silly.

anonymoushn 1093 days ago

unsafe rust is less safe than C++ because of the provenance and aliasing semantics that unsafe rust must adhere to to avoid UB, which are generally trickier than those of C++

tialaramex 1092 days ago

The provenance rules in the C++ standard are basically just a shrug emoji†, so it's unclear whether those are worse. I can see an argument that obeying Aria's strict provenance experiment rules in Rust is easier - not because it's easy (although for many common cases it is) but because at least these are coherent rules.

† U+1F937 person shrugging
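
To make the aliasing side of this exchange concrete, here is a minimal sketch. The function name `swap_ends` is made up for illustration. It stays within the rules by deriving every access from a single raw pointer, rather than holding two `&mut` references to the same memory at once. The commented-out lines show the kind of pattern that current aliasing models treat as undefined behavior.

```rust
/// Swaps the first and last elements of a slice through raw pointers.
/// `swap_ends` is a hypothetical helper, not a standard library function.
fn swap_ends(v: &mut [i32]) {
    if v.len() < 2 {
        return; // nothing to swap
    }
    let len = v.len();
    // One raw pointer derived from the unique borrow; all accesses go through it.
    let base: *mut i32 = v.as_mut_ptr();
    // SAFETY: `base` points to `len` initialized elements, and indices 0 and
    // `len - 1` are both in bounds because `len >= 2`.
    unsafe {
        let first = base;
        let last = base.add(len - 1);
        std::ptr::swap(first, last);
    }
    // By contrast, unsafe Rust must avoid code like this:
    //     let a = unsafe { &mut *base };
    //     let b = unsafe { &mut *base }; // second `&mut` to the same place
    //     *a += 1; *b += 1;              // using both breaks the aliasing rules
}

fn main() {
    let mut data = [1, 2, 3, 4];
    swap_ends(&mut data);
    assert_eq!(data, [4, 2, 3, 1]);
}
```

The equivalent C++ with two `int*` to the same element would be fine, which is roughly the asymmetry the comment above is pointing at.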

umanwizard 1093 days ago

The core value proposition of rust is that it’s memory safe by default, and it’s possible to limit the set of code that needs to be manually checked for UB. This isn’t the case for C++, as any code anywhere can invoke undefined behavior.

pjmlp 1093 days ago

True, as long as static analysers aren't part of the build. Once they are, specific constructs can be made to break the CI/CD build, forcing everyone to play by the rules if they want the PR to go through.

It isn't perfect, but it does improve the security baseline a lot.

ammar2 1093 days ago

Sure but you're missing the

> so long as you are diligent about checking invariants

part. Could you go through and check all the parts of a huge C++ codebase to make sure invariants are held as opposed to a few hundred lines of unsafe Rust code?

mr_00ff00 1093 days ago

Sure, but I think the point here is the degree.

Presumably if it takes a lot of unsafe rust lines to build something, it won’t matter if it’s 30% safe or whatever.

I just see the point of “unsafe is fine” a lot when the whole point of rust is that memory safety issues are never worth the cost.

ammar2 1093 days ago

Right, I guess the question is what will that proportion be when Rust is used for things like operating systems and web browsers. 30% would be untenable but a few hundred/thousand lines of unsafe code is fairly easy to put under a microscope.

For some current day research into this, there is the paper "How Do Programmers Use Unsafe Rust?"[1] which I'll drop a quote from here:

> The majority of crates (76.4%) contain no unsafe features at all. Even in most crates that do contain unsafe blocks or functions, only a small fraction of the code is unsafe: for 92.3% of all crates, the unsafe statement ratio is at most 10%, i.e., up to 10% of the codebase consists of unsafe blocks and unsafe functions

That paper is definitely worth reading and goes into why programmers use unsafe; e.g., 5% of the crates at that time were using it to perform FFI.

In writing "RUDRA: Finding Memory Safety Bugs in Rust at the Ecosystem Scale" [2], I recreated this data, and year by year the % of crates using unsafe is going down. And for what it's worth, crates are probably a bad data-set for this. Crates tend to be libraries, which are exactly where we would expect to find unsafe code encapsulated to be used safely. There are also plenty of experimental and hobby crates. A large dataset of actual binaries would be way more interesting to look at.

[1] https://dl.acm.org/doi/10.1145/3428204

[2] https://taesoo.kim/pubs/2021/bae:rudra.pdf

Gwypaas 1093 days ago

Or Rust in Android: this deep dive describes a component with only two places of unsafe code, where the extra vetting triggered by those being the only two places found a bug in the existing implementation.

https://security.googleblog.com/2022/12/memory-safe-language...

mr_00ff00 1093 days ago

Ahh that is quite interesting, I’ll check those links out

CJefferson 1093 days ago

Looking at a couple of programs I work on:

9,500 lines of code, 8 are unsafe.

7,000 lines of code, 22 are unsafe.

14,000 lines of code, 140 are unsafe.

As we follow the standard rust rule that "safe code should not be able to use unsafe code to do unsafe things", those unsafe bits of code have been very carefully checked, to the best of our abilities, to ensure they don't create memory safety issues. It is a lot easier to triple-check 170 lines of code than 30,000 lines.
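
A minimal sketch of what that rule looks like in practice, assuming a made-up helper called `pair_mut` (it is not from any of the programs above). The function is safe to call with any arguments, because it checks everything the `unsafe` block depends on before reaching it.

```rust
/// Returns mutable references to two distinct elements, or `None` if the
/// indices are equal or out of bounds. Hypothetical helper for illustration.
fn pair_mut<T>(v: &mut [T], i: usize, j: usize) -> Option<(&mut T, &mut T)> {
    // All preconditions of the unsafe block are checked here, in safe code.
    if i == j || i >= v.len() || j >= v.len() {
        return None;
    }
    let base = v.as_mut_ptr();
    // SAFETY: both indices are in bounds and different, so the two
    // references point to distinct, valid elements of the same slice.
    unsafe { Some((&mut *base.add(i), &mut *base.add(j))) }
}

fn main() {
    let mut scores = vec![10, 20, 30];
    if let Some((a, b)) = pair_mut(&mut scores, 0, 2) {
        std::mem::swap(a, b);
    }
    assert_eq!(scores, [30, 20, 10]);
    // Bad inputs are rejected instead of reaching the unsafe block.
    assert!(pair_mut(&mut scores, 1, 1).is_none());
    assert!(pair_mut(&mut scores, 0, 3).is_none());
}
```

Reviewing this means checking one bounds test and one SAFETY comment, which is the kind of triple-checking that stays cheap when the unsafe line count is small.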

Animats 1093 days ago

Sharpview, my metaverse viewer: 36,000 lines, 0 are unsafe.

I use some published crates that have unsafe code, but my own programs start with

    #![forbid(unsafe_code)]

This is 60FPS 3D high-detail graphics stuff, where performance matters.
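
For readers who have not seen the lint in action, here is a small sketch. The comment above uses the crate-level form `#![forbid(unsafe_code)]`; this sketch applies the same lint to a single module so it fits in one file. The module `geometry` and the function `mean` are invented for the example.

```rust
// Same lint as the crate-level `#![forbid(unsafe_code)]`, scoped to one module.
#[forbid(unsafe_code)]
mod geometry {
    /// Hypothetical helper: average of a slice, written without any `unsafe`.
    pub fn mean(values: &[f32]) -> Option<f32> {
        if values.is_empty() {
            return None;
        }
        Some(values.iter().sum::<f32>() / values.len() as f32)
    }

    // Uncommenting this function makes the build fail, because `forbid`
    // rejects the `unsafe` block and cannot be overridden with `allow`:
    // pub fn first(values: &[f32]) -> f32 {
    //     unsafe { *values.get_unchecked(0) }
    // }
}

fn main() {
    assert_eq!(geometry::mean(&[1.0, 2.0, 3.0]), Some(2.0));
    assert_eq!(geometry::mean(&[]), None);
}
```

Note that the lint only covers the code it is applied to; dependencies can still contain unsafe code, as the comment above says.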

cyber_kinetist 1092 days ago

Are you using wgpu for the rendering stuff? Heard that WebGPU had to sacrifice some performance in order to make the API safer for the web (like more bounds checking and sanity checks). These kinds of issues are actually plaguing projects like Tensorflow.js (for example see https://github.com/gpuweb/gpuweb/issues/1202).

Other libraries like Vulkan and DirectX 12 are fundamentally unsafe at the API level, so direct usage of them would lead to heaps of unsafe Rust code. Rust people have tried wrapping them in a safe way (like gfx-rs and vulkano) but nowadays most seem to have transitioned to wgpu (since the WebGPU API is safe by design, it fits the Rust ecosystem better).

Rust does sacrifice some performance in general in order to achieve its safety claims, but people are happy with it so far, since the majority of applications using Rust (like CLI apps and web servers) don't have to squeeze out performance that much (for webdev there are too many things that can cause performance issues other than not writing it in Rust). But for 3D graphics people can be more sensitive about these problems. Though maybe if you're not developing a triple-A game with the latest cutting-edge graphics (with new techniques like "hardware ray tracing" and "bindless descriptors", which are both impossible in wgpu), writing in Rust can be a good-enough tradeoff for your needs.

Animats 1092 days ago

WGPU is just finishing up a major reorganization of locking and internal memory management, going from a global lock to fine-grained Arc reference counts.[1] Change log, just posted a few minutes ago: "Arcanization of wgpu core resources: Removed 'Token' and 'LifeTime' related management, removed 'RefCount' and 'MultiRefCount' in favour of using only 'Arc' internal reference count, removing mut from resources and added instead internal members locks on demand or atomics operations, resources now implement Drop and destroy stuff when last 'Arc' resources is released, resources hold an 'Arc' in order to be able to implement Drop, resources have an utility to retrieve the id of the resource itself, removed all guards and just retrive the 'Arc' needed on-demand to unlock registry of resources asap removing locking from hot paths."

From a performance standpoint, I'm much more concerned about being able to get all the CPUs working on the problem than slight improvements in per-CPU performance. My metaverse viewer has slow frames because loading content into the GPU from outside the rendering thread blocks the rendering thread. All that "Arcanization" should fix that.

(I'm a user of WGPU, not one of its developers.)

[1] https://github.com/gfx-rs/wgpu/pull/3626/files/5b34df5a2e6f9...

cyber_kinetist 1092 days ago

A counterpoint that makes this argument a bit weaker: Rust’s “unsafe” marker doesn’t pollute only its own scope; it actually pollutes the whole module. You need to make sure that the invariants in unsafe code are met even in safe code. (An explanation of this in the Rustonomicon: https://doc.rust-lang.org/nomicon/working-with-unsafe.html)

So there’s quite a lot more code to actually check than what some of the Rust proponents are saying. One can say that C++ is still worse in this regard (theoretically you need to check 100% of your code to be safe in C++). But for some minority of developers who frequently need to delve into unsafe code, the advantages of Rust might seem a bit more disappointing (“the compiler doesn’t really do that much for the more important stuff…”)
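
A small sketch of the module-level point, in the spirit of the Rustonomicon chapter linked above but with an invented type (`SmallStack` and its methods are hypothetical). The `unsafe` block in `top` is only sound because of an invariant that safe code elsewhere in the same module maintains.

```rust
mod stack {
    /// Fixed-capacity stack of bytes.
    /// Invariant relied on by the `unsafe` block below: `len <= 4`.
    pub struct SmallStack {
        buf: [u8; 4],
        len: usize,
    }

    impl SmallStack {
        pub fn new() -> Self {
            SmallStack { buf: [0; 4], len: 0 }
        }

        /// Pushes a byte; returns `false` if the stack is full.
        pub fn push(&mut self, byte: u8) -> bool {
            if self.len == self.buf.len() {
                return false;
            }
            self.buf[self.len] = byte;
            self.len += 1;
            true
        }

        pub fn top(&self) -> Option<u8> {
            if self.len == 0 {
                return None;
            }
            // SAFETY: relies on `1 <= len <= 4`. Only code inside this module
            // can break that, because `len` is a private field.
            Some(unsafe { *self.buf.get_unchecked(self.len - 1) })
        }

        // A method like this contains no `unsafe`, yet adding it would make
        // `top` unsound:
        // pub fn set_len(&mut self, n: usize) { self.len = n; }
    }
}

fn main() {
    let mut s = stack::SmallStack::new();
    assert_eq!(s.top(), None);
    assert!(s.push(7));
    assert!(s.push(9));
    assert_eq!(s.top(), Some(9));
}
```

So the audit boundary is the module (everything that can touch `len`), not just the one line inside `unsafe { ... }`. Privacy is what keeps that boundary from growing to the whole program.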

kaba0 1092 days ago

> whole point of rust is that memory safety issues are never worth the cost

I don’t think that it would be the point of rust — otherwise why not write Java, or a litany of GCd languages instead?

Rust is a low-level/systems programming language where you have more control over the program’s execution (e.g. no fat runtime), which is a necessity in some rare, niche, but important use cases.

dralley 1091 days ago

It almost never takes a lot of unsafe to build something. Even the redox OS kernel is only like 10% unsafe.

Ygg2 1093 days ago

It's not what unsafe means. Unsafe means this might cause UB for some invocations (accessing raw pointers, calling into another language, etc.). Safe means it will not cause UB for any invocations (it may panic or abort).
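
A rough illustration of that distinction, using two made-up functions (`get_checked` and `get_trusting` are names chosen for this sketch). The safe one may panic for some inputs but never causes UB. The unsafe one causes UB for some inputs, so the caller has to uphold its contract.

```rust
/// Safe: for every possible `idx` this either returns a value or panics.
/// It never causes undefined behavior.
fn get_checked(v: &[u32], idx: usize) -> u32 {
    v[idx]
}

/// Unsafe: the caller must guarantee `idx < v.len()`.
/// For an out-of-bounds `idx`, calling this is undefined behavior.
unsafe fn get_trusting(v: &[u32], idx: usize) -> u32 {
    // SAFETY: the caller promises `idx < v.len()`.
    unsafe { *v.get_unchecked(idx) }
}

fn main() {
    let v = [10, 20, 30];
    assert_eq!(get_checked(&v, 1), 20);
    // SAFETY: 2 < v.len()
    assert_eq!(unsafe { get_trusting(&v, 2) }, 30);
    // An out-of-bounds index panics in the safe version instead of causing UB.
    let result = std::panic::catch_unwind(|| get_checked(&[10, 20, 30], 5));
    assert!(result.is_err());
}
```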