Hacker News
by softirq 1089 days ago
> Low-level Operating Systems Sr. User Experience Researcher

Wow, I didn't even know this job existed. IMO Rust as a C++ replacement is fine, but Rust as a C replacement has more trade-offs than I care to make. C is still far simpler (you can still read K&R in one day and keep most of the language in your head), has faster compile times, and the pain points (cough, macros) are still often pain points in Rust.

I think the biggest thing is that systems programming still requires a language that gets out of the way so you can focus on very technical problem domains where what the hardware is actually doing really matters. Rust is a language designed to get in your way and force you to create type abstractions. Adding too many abstractions can be exceedingly dangerous in an environment where not having a full view of how memory and hardware registers are laid out leads to even worse errors than just buffer overflows. IMO Rust makes this type of programming more difficult, just as C++ does.

11 comments

The "simplicity" of C is not a good thing. The Brainfuck language is even "simpler" and you can read the spec in 2 minutes. But that does not make it easier, because all the complexity is in the usage.

The abstraction layers that one can build with Rust allow the programmer to focus on the actual business logic instead of trying to get low-level details right.

I think you missed the part about systems programming. When you are programming real hardware, the focus is entirely on low-level details. You need to know that the commands being sent to the device, the device state, and the ownership of resources by the device (which Rust doesn't solve for) are correct.

The innovation of Rust is the borrow checker, which is primarily of interest to systems programmers. If your primary interest is highly abstracted business logic, there are tools that don't require manual memory management or being pedantic about the different types of strings. You could just use Go, Java, Haskell, Python, etc.

> there are tools that don't require manual memory management or being pedantic about the different types of strings. You could just use Go, Java, Haskell, Python, etc.

1. Rust doesn't force you to do manual memory management. Rust memory management is automatic by default and only if you really, really want to, you can do it manually.

2. Memory is not the only resource. The GCs in languages you listed only solve the memory management problem, but regarding the other types of resources, their ergonomics are often worse than C - you have to remember to close the resources manually and you get virtually no help from the compiler.

3. None of the listed languages address problems related to concurrency, e.g. data races. Ok, Haskell kinda avoids the problem by imposing other restrictions - by not allowing mutation / side effects ;)

4. Rust offers way better tools for building high level abstractions than Go, Python and Java. It has set a very high bar with algebraic data types, pattern matching, traits/generics and macros.

1. Rust does manual memory management. It has some syntactic sugar for it in the form of a compiler-enforced RAII, but that is still manual memory management for all practical purposes. A good distinction to make is whether low-level memory/ownership details leak into public APIs. This is trivially true for Rust, while it is not true of managed languages.

3. Rust only addresses data races, and not merely as one example of a wider class. All the other race conditions are still on the table. It is a good thing to have, but I think data races are the least problematic and easiest-to-solve part of concurrency issues.

4. All of these have been known for like 3 decades. There are plenty of managed languages with these, ML, OCaml, Haskell, Scala. But I think your claim is subjective at best.

> It has some syntactic sugar for it in the form of a compiler-enforced RAII, but that is still manual memory management for all practical purposes.

Then you've got a different definition of "manual" than mine. Manual means that the developer has to insert calls to allocate / deallocate memory and that the developer is responsible for proving the correctness of those calls. Automated means those calls are made by the runtime or by the compiler automatically, and the compiler makes sure they are correct. In the case of Rust, those calls are inserted automatically by the compiler.
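
A minimal sketch of what "inserted automatically by the compiler" looks like in practice. The `Buffer` type and its log are hypothetical, made up only so the release order is observable; nothing in the function calls a free or close routine.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical resource that records when it is released.
struct Buffer {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Buffer {
    // Runs from compiler-inserted code when the value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("freed {}", self.name));
    }
}

/// Creates two buffers and returns the order in which they were released.
fn release_order() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _a = Buffer { name: "a", log: Rc::clone(&log) };
        let _b = Buffer { name: "b", log: Rc::clone(&log) };
        // No explicit free: both drops happen at the closing brace,
        // in reverse declaration order.
    }
    let order = log.borrow().clone();
    order
}
```

The deallocation points are fixed at compile time by scope, which is the part both sides of this exchange agree on; the disagreement is only over whether to call that "manual".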

> memory/ownership details leak into public APIs

The fact that ownership is a part of the public API is a good thing, similar to how it is a good thing to specify that an argument is an integer and not a string.
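
As a rough illustration of ownership showing up in signatures, here is a sketch with three hypothetical functions (`consume`, `checksum`, `zero`). The parameter type alone tells the caller whether the buffer is given away, read, or modified.

```rust
/// Takes ownership: the caller gives the buffer away.
fn consume(buf: Vec<u8>) -> usize {
    buf.len()
}

/// Shared borrow: read-only access, the caller keeps the buffer.
fn checksum(buf: &[u8]) -> u32 {
    buf.iter().map(|&b| u32::from(b)).sum()
}

/// Exclusive borrow: may modify, the caller gets the buffer back afterwards.
fn zero(buf: &mut [u8]) {
    buf.fill(0);
}

fn ownership_demo() -> (u32, u32, usize) {
    let mut data = vec![1u8, 2, 3];
    let before = checksum(&data);
    zero(&mut data);
    let after = checksum(&data);
    let len = consume(data);
    // `data` has been moved; using it here would be a compile error.
    (before, after, len)
}
```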

> There are plenty of managed languages with these, ML, OCaml, Haskell, Scala.

I referred to the ones mentioned in the above comment, which mentioned Java/Go/Python. Haskell/Scala/OCaml/ML are quite niche even compared to Rust these days.

But even though Haskell / Scala might get close on some type-system features, they don't offer an experience similar to Rust's in other areas. Haskell is more restrictive in terms of managing state than the borrow checker, and Scala tooling / compile times have always been horrible.

> Rust only addresses data races, and not merely as one example of a wider class. All the other race conditions are still on the table.

This is like saying a statically typed language doesn't stop you from putting a string telephone number into a string surname field. Sure it doesn't. But despite that, the value of static types is hard to overestimate.

In practice, the borrow checking + RAII + Send/Sync rules can be used to make the other types of concurrency problems very unlikely by properly modeling the APIs. Sure, no language can protect from all concurrency problems in general, but at least Rust gives you some good tools. For instance it is trivial to forbid concurrent access to something that shouldn't be accessed concurrently and let the compiler enforce that. Now try enforcing that in your "business oriented language of choice". In my experience the majority of concurrency related problems in real large-scale software development happen when some code not designed to handle concurrency accidentally becomes executed concurrently because developers don't realize something is shared and mutated at the same time. Another type of common issue is with communicating concurrent threads of execution, when one sends a message but the receiver is not there on the other end because of premature exit e.g. due to error, leading to a deadlock. Rust protects from those really well.
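
Two small sketches of the points above, using only the standard library. The function names are made up for illustration. The first shares a counter across threads, which only compiles because it is wrapped in `Arc<Mutex<_>>`; the second shows that sending to a receiver that has already exited returns an error instead of blocking.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Shared mutable state must be wrapped in something that is safe to share;
/// a bare `&mut u32` could not be handed to several threads.
fn count_with_threads(threads: usize) -> u32 {
    let counter = Arc::new(Mutex::new(0u32));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let c = Arc::clone(&counter);
            thread::spawn(move || {
                *c.lock().unwrap() += 1;
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

/// Returns true if the send failed because the receiver was gone.
fn send_to_exited_receiver() -> bool {
    let (tx, rx) = mpsc::channel::<u32>();
    // The receiving thread exits early (a simulated error) and drops `rx`.
    let worker = thread::spawn(move || {
        let _rx = rx;
    });
    worker.join().unwrap();
    // The sender gets an error value rather than waiting forever.
    tx.send(42).is_err()
}
```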

> The fact that ownership is a part of public API is a good thing

A library's next version that changes the memory handling of some internal representation should ideally not break your application, but with ownership in the public API it can. It also mandates a higher refactor rate even when you are only working within your application’s boundaries. These are worthwhile tradeoffs for the niche Rust is targeting, but not for every use case.

I’m not saying Rust is a bad language, I really like using it for its intended niche of complex applications where absolute control is needed, like a browser engine. But it is not a panacea and I would definitely not choose it for a CRUD webapp.

> “and the ownership of resources by the device (which Rust doesn't solve for) are correct.”

For embedded software, the underlying code basically reads and writes a bunch of registers. Unsafe memory access with side effects. The benefit of using rust here is that you can easily model these access patterns to make an api that cannot be abused. So the driver reads and writes addresses and the user code operates through the driver with all the benefits of ownership at hand to avoid race conditions and other foot-guns.

So, in this way rust does indeed “solve for” ownership of devices. You can’t have two threads (or interrupt handlers) mutating the same device without satisfying the ownership rules.
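
A host-runnable sketch of that idea, under loud assumptions: `Gpio` is a hypothetical peripheral handle, and a plain field stands in for what would be a memory-mapped register on real hardware.

```rust
use std::thread;

/// Hypothetical peripheral handle. On real hardware `reg` would be a
/// memory-mapped register accessed with volatile reads and writes.
struct Gpio {
    reg: u32,
}

impl Gpio {
    /// Mutation requires exclusive access to the handle.
    fn set_pin(&mut self, pin: u8) {
        self.reg |= 1u32 << pin;
    }

    fn state(&self) -> u32 {
        self.reg
    }
}

fn gpio_demo() -> u32 {
    let mut gpio = Gpio { reg: 0 };
    gpio.set_pin(0);
    // Moving the handle into the thread transfers ownership of the device.
    let worker = thread::spawn(move || {
        gpio.set_pin(3);
        gpio
    });
    // gpio.set_pin(1); // compile error: `gpio` was moved into the thread
    let gpio = worker.join().unwrap();
    gpio.state()
}
```

This only covers the case where one handle maps cleanly onto one piece of hardware.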

>You can’t have two threads (or interrupt handlers) mutating the same device without satisfying the ownership rules.

The trouble with this is that abstract 'devices' don't necessarily map neatly to the underlying hardware. Configuring peripherals on a typical microcontroller typically requires setting flags in a bunch of random registers which don't necessarily have neatly separated responsibilities.

Take PWM as an example. Is there a PWM 'device'? Is there a PWM setting for each port, according to some abstract representation of ports? What about the timer used to generate the PWM output? Does the PWM device own the timer, or does the timer own the PWM device? Any such abstractions cause more problems than they solve. You really just need to think carefully about how you are manipulating the underlying hardware.

In my experience a decent way to solve this is by two layers of abstractions. I will take any better design ideas!

The first layer gives you safe access to the hardware registers. For example, ensure atomic/synchronized access, forbid invalid/reserved values. Name the flags/bits to reduce human mistakes (reg |= Prescaler::Div8).

You can still misconfigure the PWM/Timer settings of course.

The second layer gives you a safe driver interface, giving you all the options to configure a timer for some PWM settings, for example.
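
A sketch of the two layers, not a real HAL. The register layout, bit positions, and every name other than `Prescaler::Div8` are invented for illustration, and a plain field stands in for the hardware register.

```rust
/// Layer 1: named register values; reserved encodings cannot be constructed.
/// The bit patterns are hypothetical.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u8)]
enum Prescaler {
    Div1 = 0b001,
    Div8 = 0b010,
    Div64 = 0b011,
}

/// Layer 1: hypothetical timer control register (a field stands in for MMIO).
struct TimerCtrl {
    bits: u8,
}

impl TimerCtrl {
    const PRESCALER_MASK: u8 = 0b0000_0111;
    const PWM_ENABLE: u8 = 0b0000_1000;

    fn set_prescaler(&mut self, p: Prescaler) {
        self.bits = (self.bits & !Self::PRESCALER_MASK) | p as u8;
    }

    fn enable_pwm(&mut self) {
        self.bits |= Self::PWM_ENABLE;
    }
}

/// Layer 2: a driver that owns the register and only exposes coherent setups.
struct Pwm {
    ctrl: TimerCtrl,
    duty_percent: u8,
}

impl Pwm {
    fn new(mut ctrl: TimerCtrl, prescaler: Prescaler, duty_percent: u8) -> Option<Pwm> {
        if duty_percent > 100 {
            return None; // reject settings the hardware could not honor
        }
        ctrl.set_prescaler(prescaler);
        ctrl.enable_pwm();
        Some(Pwm { ctrl, duty_percent })
    }

    fn raw_ctrl(&self) -> u8 {
        self.ctrl.bits
    }

    fn duty_percent(&self) -> u8 {
        self.duty_percent
    }
}
```

Because `Pwm::new` takes the `TimerCtrl` by value, nothing else can reconfigure the timer while the PWM driver exists, which is one possible answer to the "who owns the timer" question raised earlier.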

Absolutely. I used to work in a C++ shop writing mission-critical software. We would be far more concerned about letting a newcomer contribute to an existing C++ codebase than to an existing Rust codebase. The newcomer would surely have a harder time making a pull request that compiles and passes tests in Rust than in C++, but that is a good thing for maintainers.

(At least there is no UB in Brainfuck..)

Several successor languages have kept the simple syntax of C while eliminating a large class of warts. Hindsight and not having to keep a wide-reaching set of compiler standards let a language have simple syntax and not so many problems with UB.

> you can still read K&R in one day and keep most of the language in your head

Does this include all 193 cases of undefined behavior?

I was always taught, and memorized, 197. Are you forgetting some? Most of the items on those lists are very easy, since they are basically the same error played out slightly differently.

I know you're just trashing C UB because that's fashionable, but if you really think about it there are two popular options, and you can't have both at the same time:

a) UB enables valuable optimizations and is important to keep (or even add) when performance matters

b) UB makes the language unusable/insecure to anyone but genius level experts and should be avoided

Whenever someone (including famous/relevant people like Dennis Ritchie [0], DJ Bernstein [1], or Linus Torvalds [2]) tries to suggest cleaning up, removing, or simply not adding new cases of undefined behavior in C/C++, the optimization experts come running from the other room screaming about how important it is that "signed integer overflow must be undefined" [3] or else things will run a percent more slowly (signed overflow being just one example of UB). Also there are people who suggest adding new UB to Rust [4].

So really, either Rust is significantly slower than C because Rust doesn't have the UB you're criticizing, or C could be a cleaner language without compromising on speed and the compiler writers and standards committees are wrong. You choose, but both options are considered heresy.

[0] https://www.lysator.liu.se/c/dmr-on-noalias.html

[1] https://groups.google.com/g/boring-crypto/c/48qa1kWignU

[2] https://lkml.org/lkml/2018/6/5/769

[3] https://youtu.be/yG1OZ69H_-o

[4] https://www.ralfj.de/blog/2021/11/24/ub-necessary.html

Rust does actually do many of the same optimizations, and in unsafe Rust this can and does lead to UB.

An example of this is aliasing mutable references. Rust has been designed to assume that two mutable refs will never alias; if you attempt to do this it is instant UB. My understanding is that even creating the aliasing reference is UB, even if you don't use it.

Another example would be uninitialized values. In Rust, the compiler assumes that values are always "valid" and initialized. Since you need to allocate memory before writing to it, you need some way to safely have uninit values, and this is what the MaybeUninit wrapper type is for. The wrapper allows you to safely have uninit values, and once you write to them you can tell the compiler that they are initialized, but if you accidentally tell the compiler before they are written, it is UB.

References also are guaranteed to point to initialized and valid data, and can never be null, though my understanding is that there is some uncertainty about the exact rules of this with regards to uninitialized values and the exact semantics may change in the future.

(There are also a lot more things that I don't know very much about!)

All of these things are assumed to never happen, and optimizations are performed based on that assumption.

The nice part about Rust is that safe code cannot represent these invalid states at all, because the type system rules them out!

For aliasing, you can only have a single live mutable reference at any time; attempting to create a second one while another is live is a compile time error!
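
A small sketch of the aliasing rule in safe code. `bump_twice` is a made-up function; the commented-out line is the one the compiler would reject.

```rust
fn bump_twice(values: &mut [i32]) -> i32 {
    let first = &mut values[0];
    // let second = &mut values[0]; // rejected (E0499): `first` is still used below
    *first += 1;
    // `first` is not used past this point, so a new exclusive borrow is fine.
    let second = &mut values[0];
    *second += 1;
    values[0]
}
```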

For uninitialized values, you simply can't create uninitialized values at all in safe rust. My understanding is that the only way to create an uninitialized value is with the MaybeUninit type or by using raw pointers.
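
A minimal sketch of the `MaybeUninit` pattern described above, assuming a made-up `make_squares` function. The only `unsafe` step is the claim that every slot has been written.

```rust
use std::mem::MaybeUninit;

fn make_squares() -> [u32; 4] {
    // Uninitialized storage; treating it as `[u32; 4]` at this point would be UB.
    let mut slots: [MaybeUninit<u32>; 4] = [MaybeUninit::uninit(); 4];
    for (i, slot) in slots.iter_mut().enumerate() {
        slot.write((i * i) as u32);
    }
    // SAFETY: every element was written in the loop above.
    slots.map(|s| unsafe { s.assume_init() })
}
```

Calling `assume_init` before the loop would compile just the same, which is why this lives behind `unsafe`.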

So rust still has heaps of UB, but it doesn't allow you to do it by default, so you still get some of the optimizations you'd expect.

I think there are some cases where rust is missing out on optimizations though, like with signed int overflow for example, and probably more that I don't know about!
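
For reference, signed overflow in Rust is defined behavior: plain `+` panics when overflow checks are enabled (the default in debug builds) and wraps otherwise. The standard library also lets you pick the behavior explicitly; `overflow_modes` below is just an illustrative wrapper around those methods.

```rust
fn overflow_modes() -> (i32, Option<i32>, (i32, bool), i32) {
    let x = i32::MAX;
    (
        x.wrapping_add(1),    // two's complement wrap-around
        x.checked_add(1),     // None on overflow
        x.overflowing_add(1), // wrapped value plus an overflow flag
        x.saturating_add(1),  // clamps at the maximum
    )
}
```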

I can see your enthusiasm for Rust, but my comment was more directed to whether criticisms of UB in C can be taken seriously considering suggestions to remove it where possible have almost always been shot down in the name of speed.

> I think there are some cases where rust is missing out on optimizations though, like with signed int overflow for example

Before you push for UB in safe Rust, I politely suggest you write your own benchmark for this in C or C++. Try a few combinations of { signed, unsigned } X { 32 bit, 64 bit } X { gcc, clang }, compiled at -O2 or higher, and see what you get for results. Maybe throw in -fwrapv for some of the runs. My own conclusion on my own benchmarks is that UB advocates are mostly wrong.

I think this is often a function of Rust OSS libs rather than the language itself. The embedded Rust community has created and promoted bad APIs. I think it's worth pointing this out and building other APIs, even though this doesn't endear you to the Rust community. I'm worried people will get the wrong idea about embedded Rust ergonomics and attribute these APIs to the language itself.

Can you talk more about this? I'm very curious.

Well, for starters, allocators. The standard library is now making adjustments around that, but it's not mainstream, most people don't use them, and they don't get the same level of attention as the happy path.

However, Rust the language has its own issues on embedded.

Rust's ownership model is directly at odds with lots of embedded. "These bits inside a register are owned by the ADC and these other bits inside the register are owned by the DAC" is not a happy thing in Rust.

Lack of arbitrary-sized integers and how they slice.

Cargo. Quite annoying to deal with cargo and an embedded toolchain. The Rust embedded guys have done really good work if you're on ARM. If you're not, good luck.

That having been said, if you have to go implement something like Reference Counting in something not Rust, you will weep tears of blood debugging every single time your reference counts go wrong.

Embedded is engineering. It has tradeoffs. That's life.

I do not have as strong of feelings as your parent, but:

1. A lot of the APIs make use of the typestate pattern, which is nice, but also very verbose, and might turn many people off.

2. The generated API documentation for the lower-level crates relies on you having a feel for how it generates the various APIs. It can take some time to get used to, especially if you're used to the better documentation of the broader ecosystem.

3. A bunch of the ecosystem crates assume the "I am running one program in ring0" kind of thing, and not "I have an RTOS" sort of case. See the discussion in https://github.com/rust-embedded/cortex-m/issues/233 for example.
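
For readers who haven't met the typestate pattern from point 1, a compact sketch. All names here (`Pin`, `Input`, `Output`) are hypothetical and not taken from any particular HAL crate; real crates tend to have many more states, which is where the verbosity comes from.

```rust
use std::marker::PhantomData;

// Marker types: the pin's mode is part of its type.
struct Input;
struct Output;

struct Pin<Mode> {
    number: u8,
    level: bool,
    _mode: PhantomData<Mode>,
}

impl Pin<Input> {
    fn new(number: u8) -> Self {
        Pin { number, level: false, _mode: PhantomData }
    }

    /// Consumes the input pin and returns it reconfigured as an output.
    fn into_output(self) -> Pin<Output> {
        Pin { number: self.number, level: false, _mode: PhantomData }
    }
}

impl Pin<Output> {
    // Only exists for output pins; calling it on Pin<Input> does not compile.
    fn set_high(&mut self) {
        self.level = true;
    }

    fn is_set_high(&self) -> bool {
        self.level
    }

    fn number(&self) -> u8 {
        self.number
    }
}
```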

Every time I play with embedded Rust I get this feeling that some of the people driving it are more into language esthetics and don't have experience in real-world embedded systems.

For example, I see inefficient patterns that are common in frontend world but have no place in an embedded system being promoted as "proper" way of doing things.

Could you perhaps give an example?

The other replies reflect my thoughts, so I don't have much to add. So, these are elaborations on those:

It appears most of the peripheral support libs (e.g. those that use Embedded-HAL traits) are not designed with practical ends in mind; for all the I2C/SPI etc. devices I've used, I've found it easier to start from scratch with a `setup` fn with datasheet references, then DMA transfers. So, you have these traits designed to abstract over buses etc.; they sound nice in principle, but are (so far) not useful for writing firmware.

I get a general sense that the OSS libs are designed with a "let's support this popular MCU/IC, and take advantage of Rust's type system and language features!" mindset. A bare minimum is done, it's tested on a dev board, then no further testing or work. There are flaws that show up immediately when designing a device with the lib in question.

So, at least for the publicly-available things, they're designed in an abstract sense, instead of for a practical case.

>Adding too many abstractions can be exceedingly dangerous in an environment where not having a full view of how memory and hardware registers are laid out leads to even worse errors than just buffer overflows.

svd2rust is pretty good for having safe abstractions for hardware registers. That said, as an example, no, the type system doesn't prevent you from deallocating your DMA buffer while the hardware is using it--I don't think it's reasonable to add that to the type system (and the type system right now doesn't know about DMA).

Garbage in, garbage out! Svd2rust is a great tool, but the patching process (YAMLs) is currently not user friendly due to silent failures. Root cause is hardware makers putting out bugged SVDs that need patching.

I think re DMA buffer lifetimes, the easy approach is static buffers; they never drop.
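
A sketch of how the static-buffer approach can be written into an API, assuming a hypothetical `DmaChannel` type. To keep it runnable on a host, `Box::leak` stands in for a real `static` buffer and `finish` fakes the hardware writing data.

```rust
/// Hypothetical DMA channel that only accepts buffers living for the
/// whole program, so the buffer cannot be dropped mid-transfer.
struct DmaChannel {
    active: Option<&'static mut [u8]>,
}

impl DmaChannel {
    fn start(&mut self, buf: &'static mut [u8]) {
        self.active = Some(buf);
    }

    /// Simulates the hardware finishing and hands the buffer back.
    fn finish(&mut self) -> Option<&'static mut [u8]> {
        let buf = self.active.take()?;
        buf.fill(0xAB); // stand-in for data the hardware would have written
        Some(buf)
    }
}

fn run_transfer() -> Vec<u8> {
    // Assumption: on a microcontroller this would be a `static` buffer;
    // a leaked Box gives the same 'static lifetime on a host.
    let buf: &'static mut [u8] = Box::leak(Box::new([0u8; 4]));
    let mut dma = DmaChannel { active: None };
    dma.start(buf);
    // let mut local = [0u8; 4];
    // dma.start(&mut local); // rejected: `local` does not live long enough
    dma.finish().map(|b| b.to_vec()).unwrap_or_default()
}
```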

C has beyond-useless “macros”; they should not be compared with Rust’s, which are actually useful.

C's macros are primitive and unsafe but by no means useless. Here's a somewhat silly example from embedded programming. I wanted to embed the bitmaps of a small set of characters for use on a bitmapped monochrome display. It was easy to define macros CHAR_GRID, _ and X such that e.g.

    const uint8_t zero[] = {
      CHAR_GRID(
        _,X,X,X,_,
        X,_,_,_,X,
        X,_,_,X,X,
        X,_,X,_,X,
        X,X,_,_,X,
        X,_,_,_,X,
        _,X,X,X,_
      )
    };

desugared to a column-major array of 5 bytes.

    #define _ 0  /* assumed; not shown in the original comment */
    #define X 1  /* assumed; not shown in the original comment */
    #define CHAR_GRID(c1r1, c2r1, c3r1, c4r1, c5r1, \
                      c1r2, c2r2, c3r2, c4r2, c5r2, \
                      c1r3, c2r3, c3r3, c4r3, c5r3, \
                      c1r4, c2r4, c3r4, c4r4, c5r4, \
                      c1r5, c2r5, c3r5, c4r5, c5r5, \
                      c1r6, c2r6, c3r6, c4r6, c5r6, \
                      c1r7, c2r7, c3r7, c4r7, c5r7) \
      c1r1 | (c1r2 << 1) | (c1r3 << 2) | (c1r4 << 3) | (c1r5 << 4) | (c1r6 << 5) | (c1r7 << 6), \
      c2r1 | (c2r2 << 1) | (c2r3 << 2) | (c2r4 << 3) | (c2r5 << 4) | (c2r6 << 5) | (c2r7 << 6), \
      c3r1 | (c3r2 << 1) | (c3r3 << 2) | (c3r4 << 3) | (c3r5 << 4) | (c3r6 << 5) | (c3r7 << 6), \
      c4r1 | (c4r2 << 1) | (c4r3 << 2) | (c4r4 << 3) | (c4r5 << 4) | (c4r6 << 5) | (c4r7 << 6), \
      c5r1 | (c5r2 << 1) | (c5r3 << 2) | (c5r4 << 3) | (c5r5 << 4) | (c5r6 << 5) | (c5r7 << 6)
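
For comparison, a sketch of the same row-by-row glyph layout in Rust, evaluated at compile time with a `const fn` instead of a macro. The names `pack_columns`, `X`, `O` and `ZERO` are illustrative; the bit order (row 1 in bit 0) follows the `CHAR_GRID` macro above.

```rust
/// Packs a 7-row by 5-column glyph (row-major, 1 = lit pixel) into
/// 5 column bytes, with the top row in bit 0.
const fn pack_columns(rows: [[u8; 5]; 7]) -> [u8; 5] {
    let mut cols = [0u8; 5];
    let mut r = 0;
    while r < 7 {
        let mut c = 0;
        while c < 5 {
            cols[c] |= rows[r][c] << r;
            c += 1;
        }
        r += 1;
    }
    cols
}

const X: u8 = 1;
const O: u8 = 0; // `_` cannot be used as a value name in Rust, so `O` stands in

/// Computed at compile time, like the macro expansion.
const ZERO: [u8; 5] = pack_columns([
    [O, X, X, X, O],
    [X, O, O, O, X],
    [X, O, O, X, X],
    [X, O, X, O, X],
    [X, X, O, O, X],
    [X, O, O, O, X],
    [O, X, X, X, O],
]);
```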

Something like the Flecs ECS (https://www.flecs.dev/flecs/), which makes Rust's Bevy team jealous, makes extensive use of C's "useless macros".

While that’s true, preprocessors are pretty trivial to write these days if you want macros that the language doesn’t support. Racket excels at this.

Which preprocessor do you recommend for writing C?

None.

Use an actual different language. Ada, Rust, Zig, D, Lisp/Scheme/Racket, Tcl, Forth, etc. ... something other than C.

Don't preprocess C into a slightly broken other language that you wish it were. Use C as C, or use something else.

https://stackoverflow.com/a/3685576

There’s one approach. I wouldn’t personally recommend using macros outside of include guards and file inclusion. Still, if you need more functionality, the methods exist.

> you can still read K&R in one day and keep most of the language in your head

People who say this somewhat perplex me. Yes you can get the syntax of the language down in a day, but that does little to stop you from running into your first Bus Error or Segmentation Fault within the first 30 minutes of trying to write any software, not to mention all the hidden errors/exploits you've put in your code that are only a platform switch or a compiler version change away from being found explosively. And you can completely forget trying to write a multithreaded C application, which basically confines you to very slow single-threaded code, completely tanking performance versus even the slowest dynamic language that supports multithreading, erasing any advantage for using C.

This is not a personal attack but when I have to try to come up with an assumed background for people who say this it usually involves some assumptions that the person isn't keeping in touch with the "real world" of some sort. I have trouble rationalizing it otherwise. Thus I'll usually ask what their background is when they say this to try to make sense of things.

The only places where C is still the optimal choice are those where C is already being used, or extreme platforms where there aren't good toolchains (various ASICs/rare 8-bit microprocessors). There's zero reason to use it otherwise.

> the pain points (cough, macros) are still often pain points in Rust.

Hygienic, syntax-checked macros are an entirely different animal than plain string insertion/substitution macros. I don't think this comparison is fair.
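
A small sketch of both properties with two made-up `macro_rules!` macros: arguments are parsed as whole expressions, and a local declared inside the macro does not capture a caller's variable of the same name.

```rust
/// The argument is matched as an expression, not pasted as text.
macro_rules! double {
    ($e:expr) => {
        $e * 2
    };
}

/// The macro's own `offset` is hygienic: it is distinct from any
/// `offset` in the caller's scope.
macro_rules! add_offset {
    ($e:expr) => {{
        let offset = 100;
        $e + offset
    }};
}

fn hygiene_demo() -> (i32, i32) {
    // A textual `#define DOUBLE(e) e * 2` would expand `1 + 2` to `1 + 2 * 2`.
    let a = double!(1 + 2);
    let offset = 1;
    // With textual substitution both names would refer to the inner `offset`.
    let b = add_offset!(offset);
    (a, b)
}
```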

You can and do use simple pointers in Rust, nothing prevents this.

The abstractions can be used more like static interfaces you want to reuse, e.g. a byte stream interface, a regmap interface, etc.

I think there have been a few posts pointing out that the ergonomics of using just pointers in Rust are severely lacking in comparison to languages like Zig and Odin.

> I think the biggest thing is that systems programming still requires a language that gets out of the way so you can focus on very technical problem domains where what the hardware is actually doing really matters.

Just the fact that Rust doesn’t do implicit integer conversions is by itself a huge win over C which has promotion rules that can easily trip you up when you are trying to exactly specify bits.
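
A short sketch of what that looks like when assembling bits. The helper names are made up; the conversions are standard library calls. Every widening or narrowing step has to be written out.

```rust
use std::convert::TryFrom;

/// Widening is explicit and lossless; mixing `u8` and `u16` operands
/// without a conversion is a type error, not a silent promotion.
fn pack(high: u8, low: u8) -> u16 {
    (u16::from(high) << 8) | u16::from(low)
}

/// Narrowing must be spelled out; `as` keeps only the low 8 bits.
fn low_byte(word: u16) -> u8 {
    word as u8
}

/// Narrowing that reports values that do not fit instead of truncating.
fn checked_narrow(word: u16) -> Option<u8> {
    u8::try_from(word).ok()
}
```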

> You can still read K&R in one day and keep most of the language in your head

Except that isn't what most compilers expose, including the UB semantics.

One is in for a sea of surprises when trying to write portable C code and using K&R C as language reference.

Are tracking memory allocation or variable types not pain points for large complicated programs?

In my experience as a kernel programmer, tracing allocations isn't the hard part; it's keeping the correct view of hardware state and the different types of mappings at play, be it an MMU or IOMMU device mapping, and register state. Use-after-free bugs and overflows do happen, but more and more tools come out every year that can find these things in C code, some of them even hardware-based. IMO the code quality of the kernel is very high and the defect rate isn't greater than that of projects I've worked on that use garbage collection.

Try Zig as a C replacement.

Does Zig run on everything C can, including weird devices?[0]

If the answer is still No, then it can't replace C.

0: https://en.wikipedia.org/wiki/Small_Device_C_Compiler

Zig has been able for a while now to compile to C code, which is part of our bootstrap procedure now, so you can compile that output with an appropriate C compiler and target whatever you want.

We do have plans for adding backends for unconventional targets eventually.

I'm not really looking for a replacement for C. The ecosystem of systems programming is still really C-centric, and it would take a large shift in the industry for me to justify investing in another language. I've dabbled with Rust only because early support was merged into Linux. IMO most languages are only marginal improvements over the previous generation, and those improvements don't outweigh fighting against the entrenchment of expertise, documentation, and integration that comes with established languages. It's also hard to be a true expert at a low-level domain and multiple programming languages unless you're willing to give up all of your free time.

Marginal improvements add up.

I agree, at some point they do build up to the point that change is inevitable. There are usually a lot of false starts along the way (AI is a good example of this). No matter what it is, there's going to be a lot of fighting against entrenchment, and a lot of us old-timers simply have to retire for newer developers, whether they be Rust developers or not, to come in and effect change.