by Strilanc 271 days ago
Every one of these "performance tricks" is describing how to convince rust's borrow checker that you're allowed to do a thing. It's more like "performance permission slips".
6 comments

You don't have to play this game - you can always write within unsafe { ... } like in plain old C or C++. But people do choose to play this game because it helps them to write code that is also correct, where "correct" has an old-school meaning of "actually doing what it is supposed to do and not doing what it's not supposed to".
That just makes it seem like there's no point in using this language in the first place.
Don't let perfect be the enemy of good.

Software is built on abstractions - if all your app code is written without unsafe and you have one low level unsafe block to allow for something, you get the value of rust for all your app logic and you know the actual bug is in the unsafe code

This is like saying there’s no point having unprivileged users if you’re going to install sudo anyway.

The point is to escalate capability only when you need it, and you think carefully about it when you do. This prevents accidental mistakes having catastrophic outcomes everywhere else.

I think sudo is a great example. It's not much more secure than just logging in as root. It doesn't really protect against malicious attackers in practice. And it's more of an annoyance than a protection against accidental mistakes in practice.
Unsafe isn’t a security feature per se. I think this is where a lot of the misunderstanding comes from.

It’s a speed bump that makes you pause to think, and tells reviewers to look extra closely. It also gives you a clear boundary to reason about: it must be impossible for safe callers to trigger UB in your unsafe code.
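To make the boundary rule concrete, here is a minimal sketch of a safe function wrapping an unsafe block. The function name `first_and_last` is hypothetical and chosen only for illustration; the point is that the emptiness check runs before the unsafe block, so no safe caller can reach the unchecked accesses with an out-of-bounds index.

```rust
/// Returns the first and last bytes of a slice, or `None` if it is empty.
/// Hypothetical helper, for illustration only.
pub fn first_and_last(data: &[u8]) -> Option<(u8, u8)> {
    // The check lives in safe code, in front of the unsafe block.
    if data.is_empty() {
        return None;
    }
    // SAFETY: the slice is non-empty, so indices 0 and len - 1 are in bounds.
    unsafe {
        Some((
            *data.get_unchecked(0),
            *data.get_unchecked(data.len() - 1),
        ))
    }
}
```

A reviewer only has to confirm that the SAFETY comment holds for every path into the block; callers never write `unsafe` themselves.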

That's my point; I think after a while you instinctively repeat a command with sudo tacked on (see XKCD), and I wonder if I'm any safer from myself like that?

I'm doubtful that those boundaries that you mention really work so great. I imagine that in practice you can easily trigger faulty behaviours in unsafe code from within safe code. Practical type systems are barely powerful enough to let you inject a proof of valid-state into the unsafe-call. Making a contract at the safe/unsafe boundary statically enforceable (I'm not doubting people do manage to do it in practice but...) probably requires a mountain of unessential complexity and/or runtime checks and less than optimal algorithms & data structures.

Because only lines marked with unsafe are suspicious, instead of every line of code.

Also, the community culture matters: static analysis has existed for C since 1979, yet it is still something we need to force-feed to many developers in the C and C++ world.

This is an issue that you would face in any language with strong typing. It only rears its head in Rust because Rust tries to give you both low-level control and strong types.

For example, in something like Go (which has a weaker type system than Rust), you wouldn't think twice about paying for the re-allocation in the buffer-reuse example.

Of course, in something like C or C++ you could do these things via simple pointer casts, but then you run the risk of violating some undefined behaviour.

In C I wouldn't use such a fluffy high-level approach in the first place. I wouldn't use contiguous unbounded vec-slices. And no, I wouldn't attempt trickery with overwriting input buffers. That's a bad inflexible approach that will bite at the next refactor. Instead, I would first make sure there's a way to cheaply allocate fixed size buffers (like 4 K buffers or whatever) and stream into those. Memory should be used in an allocate/write-once/release fashion whenever possible. This approach leads to straightforward, efficient architecture and bug-free code. It's also much better for concurrency/parallelism.
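The allocate/write-once/release pattern described above might look roughly like the following sketch. It is written in Rust rather than C, and the names `CHUNK_SIZE` and `for_each_chunk` are hypothetical. A plain `Vec` stands in for the "cheap fixed-size allocator" the comment assumes; a real design would probably use a pool or arena instead.

```rust
use std::io::{self, Read};

/// Fixed chunk size (assumption: 4 KiB, as suggested in the comment).
pub const CHUNK_SIZE: usize = 4096;

/// Streams `input` through fixed-size chunks.
/// Each chunk is allocated, filled once, handed to `consume`, then released.
pub fn for_each_chunk<R: Read>(
    mut input: R,
    mut consume: impl FnMut(&[u8]),
) -> io::Result<()> {
    loop {
        // Allocate: a fresh fixed-size buffer for this chunk.
        let mut chunk = vec![0u8; CHUNK_SIZE];
        // Write once: fill it with at most one read.
        let n = input.read(&mut chunk)?;
        if n == 0 {
            return Ok(()); // end of input
        }
        consume(&chunk[..n]);
        // Release: `chunk` is dropped here at the end of the iteration.
    }
}
```

Because no chunk outlives its iteration, peak memory stays at one chunk regardless of input size, which is the property the comment is arguing for.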
> In C I wouldn't use such a fluffy high-level approach in the first place.

Sure, though that's because C has abstraction like Mars has a breathable atmosphere.

> This approach leads to straightforward, efficient architecture and bug-free code. It's also much better for concurrency/parallelism.

This claim is wild considering that Rust code is more bug-free than C code while being just as efficient, while keeping in mind that Rust makes parallelism so much easier than C that it stops being funny and starts being tragic.

I'm not even sure what it means for a language to "have" abstractions. Abstractions are created by competent software engineers, according to the requirements. A language can have features that make creating certain kinds of abstractions easier -- for example type-abstractions. I've stopped thinking that type abstractions are all that important. Somehow creating those always leads to decision paralysis and scope creep, and using those always leads to layers of bloat and less than straightforward programs.
& and &mut are pretty fundamental Rust abstractions.
I can tell you that there were plenty of libraries doing that in the 2000s, back when it was common to write enterprise software in C.

Plenty of abstraction possible using TU as modules, and applying Abstract Data Types design, while following Yourdon structured method with C.

> straightforward, efficient architecture and bug-free code

The grace with which C handles projects of high complexity disagrees.

You get a simple implementation only by ignoring edge cases or improvements that increase complexity.

The idea that a language can handle any complexity for you is an illusion. A language can automate a lot of the boring and repetitive small scale work. And it can have some of the things you would have otherwise coded yourself as built-ins. However you still have to deal with the complexity caused by buying into these built-ins. The larger a project gets the more likely the built-ins are to get in the way, and the more likely you are to rewrite these features or sub-systems yourself.

I'd say, more fully featured languages are most useful for the simpler side of projects (granted some of them can scale quite a way up with proficient use).

Now go research how some of the most complex, flexible, and efficient pieces of software are written.

> The idea that a language can handle any complexity for you is an illusion

I think this is wrong on its face. If it were true, we wouldn't see any correlation between the language used and the highest-complexity programs achieved in it.

As recently mentioned on HN it takes huge amounts of assembly to achieve anything at all, and to say that C doesn't handle any of the complexity you have to deal with when writing assembly to achieve the same result is absurd.

EDIT: > Now go research how some of the most complex, flexible, and efficient pieces of software are written.

I'm quite aware. To say that the choice of say, C++ in the LLVM or Chromium codebase doesn't help deal with the complexities they operate over, and that C would do just as well at their scale... well, I don't think history bears that out.

No, C doesn't actually handle the _complexity_ of writing assembly. It abstracts and automates a lot of the repetitive work of doing register allocation etc -- sure. But these are very local issues -- I think it's fair to say that the complexity of a C program isn't really much lower than the equivalent program hand-coded in assembler.

I'm not sure that LLVM would be the first consideration for complex, flexible, efficient? It's quite certainly not fast, in particular linking isn't. I'm not sure about Chromium, it would be interesting to look at some of the more interesting components like V8, rendering engine, OS interfacing, the multimedia stack... and how they're actually written. I'd suspect the code isn't slinging shared_ptr's and unique_ptrs and lambdas and is keeping use of templates minimal.

I would have thought of the Linux kernel first and foremost. It's a truly massive architecture, built by a huge number of developers in a distributed fashion, with many intricate and highly optimized parts, impressive concurrency, scaling from very small machines to the biggest machines on the planet.

Have you written a linker before? It sounds like you’re describing I/O but that isn’t the work being done here by the linker.
I've considered writing one. Why do you think what I'm describing here only applies to I/O (like syscalls)? And, more abstractly speaking -- isn't everything an I/O (like input/output) problem?
Because the data structures for a linker aren’t dealing with byte buffers. I’m pretty sure you’ve got symbolic graph structures that you’re manipulating. And as they mention, sometimes you have fairly large object collections that you’re allocating and freeing and fast 4k allocations don’t help you with that kind of stuff. Consider that the link phase for something like Chrome can easily use 1 GiB of active memory and you see why your idea will probably fall flat on its face for what Wild is trying to accomplish in terms of being a super high performance, state-of-the-art linker.
No, without more context I don't immediately see how it would fall flat. If you're dealing with larger data, like Gigabytes of data, it might be better to use chunks of 1 MB or sth like that. My point is that stream processing is probably best done in chunks. Otherwise, you have to fit all the transformed data in RAM at once -- even when it's ephemeral. Wasting Gigabytes without need isn't great.
> in something like C or C++ you could do these things via simple pointer casts

No you don't. You explicitly start a new object lifetime at the address, either of the same type or a different type. There are standard mechanisms for this.

Developers that can't be bothered to do things correctly is why languages like Rust exist.

And that is safer... how?

   Foo foo{}; init(*(Bar *)&foo);
is UB in most cases (alignment aside, if Bar is not unsigned char, char, std::byte or a base class of Foo). The reason is straightforward: Foo and Bar may have constructors and destructors. You should use construct_at if that is what you mean.

For implicit-lifetime types (iirc types with trivial default constructors (or aggregates) plus trivial destructors), you can use memcpy, bit_cast and soon std::start_lifetime_as (to get a pointer) when it is implemented.

If I'm not mistaken, in C, the lifetime rules are more or less equivalent to implicitly using C++'s start_lifetime_as

Ironically, Rust doesn't need any of that, you literally can just cast to a different pointer type between arbitrary types and start using it without it inherently being UB (you know, as long as your access patterns are more generally valid).
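As a rough illustration of that claim, the sketch below reads a value through a pointer cast to a different type. The function names `bytes_of` and `bits_of` are hypothetical. The cast itself is safe code; only the dereference is unsafe, and it is valid here because the target type has the same size, no stricter alignment, and accepts every bit pattern.

```rust
/// Views a `u32` as its four in-memory bytes via a raw pointer cast.
pub fn bytes_of(value: &u32) -> [u8; 4] {
    let p = value as *const u32 as *const [u8; 4];
    // SAFETY: `p` points to 4 initialized bytes, `[u8; 4]` has alignment 1,
    // and every bit pattern is a valid `[u8; 4]`.
    unsafe { *p }
}

/// Views an `f32` as a `u32` with the same bits via a raw pointer cast.
pub fn bits_of(value: &f32) -> u32 {
    let p = value as *const f32 as *const u32;
    // SAFETY: `f32` and `u32` have the same size and alignment,
    // and every bit pattern is a valid `u32`.
    unsafe { *p }
}
```

In everyday code the safe equivalents `u32::to_ne_bytes` and `f32::to_bits` do the same job; the pointer casts are shown only because they are what the comment is discussing.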
That's because Rust doesn't have constructors/assignment operators, is it not? Because of that, all objects are trivially relocatable (sort of, because Rust allows destructors for these objects, I guess).

And strict aliasing is not a concern due to Rust's aliasing model, so the combination of the two makes it safe to type-pun like that. But Rust's model has its downsides/is a tradeoff, so...

I don't particularly mind the C++ object model (since C++20), it makes sense after all: construct your object if it needs to be constructed, or materialize it through memcpy/bit_cast. std::start_lifetime_as should fix the last remaining usability issue with the model.

When I read articles like this, I just relish how much Go, Zig, and Bun make my life so much easier in terms of solving performance issues with reasonable trade-offs.
It is more of a culture thing, most compiled languages have been fast enough for quite some time.

People using systems languages more often than not go down the rabbit hole of performance tuning, many times without a profiler, because it still isn't the number of ms that it is supposed to be.

In reality unless one is writing an OS component, rendering engine, some kind of real time constrained code, or server code for "Webscale", the performance is more than enough for 99% of the use cases, in any modern compiler.

...Except that Rust is thread-safe, so expressing your algorithm in terms that the borrow checker accepts makes safe parallelism possible, as shown in the example using Rayon to trivially parallelize an operation. This is the whole point of Rust, and to say that C and C++ fail at thread-safety would be the understatement of the century.
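The comment refers to the article's Rayon example, which is not reproduced in this thread. As a stand-in that needs only the standard library, here is a sketch using `std::thread::scope`; the function name `parallel_sum` and the chunking scheme are assumptions made for illustration. Each thread borrows a disjoint sub-slice, and the borrow checker accepts this because the scope guarantees every thread finishes before `data` goes away.

```rust
use std::thread;

/// Sums a slice by splitting it across up to `workers` scoped threads.
/// Hypothetical helper, for illustration only.
pub fn parallel_sum(data: &[u64], workers: usize) -> u64 {
    let workers = workers.max(1);
    // Ceiling division; at least 1 because `chunks(0)` would panic.
    let chunk_len = ((data.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        // Spawn one thread per chunk; each borrows its own sub-slice.
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().sum::<u64>()))
            .collect();
        // Join all threads and combine their partial sums.
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .sum::<u64>()
    })
}
```

If a thread tried to mutate shared state without synchronization, this would fail to compile rather than race at run time, which is the thread-safety point the comment is making.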
Memory safe doesn't imply memory efficient and reasonable. Wrapping everything in Arc is the opposite of simple & easy.
If you're writing C code that shares memory between threads without some sort of synchronization primitive, then as your doctor I'm afraid I'm going to have to ask you to sit down, because I have some very bad news for you.
I'm good, thanks.
Yup -- yet another article only solving language level problems instead of teaching something about real constraints (i.e. hardware performance characteristics). Booooring. This kind of article is why I still haven't mustered the energy to get up to date with Rust. I'm still writing C (or C-in-C++) and having fun, most of the time feeling like I'm solving actual technical problems.
This was an article distilled from a talk at a Rust developers conference. Obviously it’s going to make most sense to Rust devs, and will seem unnecessary to non-Rust devs.
The rayon thing is neat.