by tialaramex 2 days ago
> If they rewrote it in C++ again, they would have most likely got the same result because they got a chance to fix a design that might not have been most optimal.

This speculation has been offered every time. It's not crazy to think this might be true, but it's also not crazy to think that if C++ keeps leaving performance on the table and Rust doesn't, that adds up for real projects.

When Titus wrote "ABI: Now or Never" in 2020 he estimated 5-10% aggregate loss. These are things that you could fix if you started over, but C++ refuses to do that because of ABI, so it doesn't have these fixes, whereas in most cases† Rust does. So I can well believe that a blow-for-blow port could get you a 10% perf win.

† One of the examples Titus cites is the "Small String Optimization". Rust deliberately doesn't do SSO for its standard library collection String, but several really nice SSO optimised types are available, including ColdString and CompactString, which are way better than what's provided in C++ if that's what you need.

Until Rust has equal meta-programming support to C++ it's always going to be "slower". That's why people always say this, because it's always true: there is nothing Rust can do that C++ can't, but there are quite a few things you can do in C++ but not in Rust.

Realistically the difference doesn't matter much, and if you're writing code that must be as fast as possible you're writing unsafe Rust that looks a lot more like C/C++ than anything Rust.

> Until Rust has equal meta-programming support to C++ it's always going to be "slower".

What metaprogramming does C++ have that rust is lacking?

If you need more than traits + generics, rust also has proc macros. Proc macros are essentially portable compiler extensions. They take in a stream of symbols from the user's program at compile time and emit rust code that gets passed straight to the compiler. You lose out on syntax highlighting and they make compilation slower. But macros are essentially compile-time code gen. They work great. In rust, you can do things like JSX at compile-time without any special compiler support. (See: leptos.)
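
As a rough illustration of the "compile-time code gen" idea, here is a minimal sketch. It uses a declarative `macro_rules!` macro rather than a proc macro, since a proc macro has to live in its own crate and doesn't fit in one snippet. The `record!` macro and the `Point` type are made up for this example.

```rust
// Declarative macro (NOT a proc macro): expands at compile time into a
// struct definition plus a method that lists the struct's field names.
// `record!` is a hypothetical name invented for this sketch.
macro_rules! record {
    ($name:ident { $($field:ident : $ty:ty),* $(,)? }) => {
        struct $name {
            $($field: $ty),*
        }

        impl $name {
            // The field names are turned into string literals during expansion.
            fn field_names() -> &'static [&'static str] {
                &[$(stringify!($field)),*]
            }
        }
    };
}

// Expands to `struct Point { x: i32, y: i32 }` and the impl above.
record!(Point { x: i32, y: i32 });
```

Calling `Point::field_names()` then gives the names that were written in the macro invocation, with no runtime reflection involved.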

> Realistically the difference doesn't matter much, and if you're writing code that must be as fast as possible you're writing unsafe Rust that looks a lot more like C/C++ than anything Rust.

I agree that the difference is small in practice. Good rust often does look a lot like C - with plain structs everywhere and lots of global-ish scoped functions.

But I don't agree about unsafe. I've spent some time porting well optimised code from C to rust. I generally find I need far less unsafe code than I expected. I ported a ~500 line skip list implementation from C to rust a few years ago. I think my rust code ended up using just 2 unsafe functions. The rest of the code didn't need any unsafe at all.

My skip list was a monster to debug in C because most logic bugs ended up corrupting memory. As a result, a bug in one function caused crashes in far away code. In rust, debugging was much easier. There wasn't any "spooky action at a distance". And that let me reason about the code much more easily. As a result, once I got it working I ended up adding a few more optimisations in rust that I was too overwhelmed to write in C. The rust code is now ~2-3x faster.

If you're interested:

https://github.com/josephg/jumprope-rs#benchmarks

C code is here: https://github.com/josephg/librope

> What metaprogramming does C++ have that rust is lacking?

Compile time execution, and compile time reflection, with the same syntax.

Proc macros are still a kludge, having to depend on the syn crate, and some stuff used to depend on nightly. Is that still the case? I don't keep track.

Additionally type specialisation, and explicit templates.

Proc macros haven't depended on nightly things in a very, very, very long time.

Thanks for the update. Outside of some weekend experiments, or having to compile specific tools from source, I don't have much reason to write Rust, so I don't keep up with much beyond the conference talks that pop up on my feeds.

> there is nothing Rust can do that C++ can't, but there are quite a few things you can do in C++ but not in Rust

The point the parent comment is making is that this doesn't really matter if no one actually does these things in C++. It's absolutely wild to me how quickly the arguments in favor of C++ instead of Rust have reversed in about a decade; people used to argue that the benefits of Rust were all theoretical and in practice people who were used to C++ could write it perfectly fine without safety issues, and now it's somehow the theory that gives C++ the advantage, and we don't need to care about whether those supposed advantages ever actually exist in practice.

> Until Rust has equal meta-programming support to C++ it's always going to be "slower".

I don't think that makes a lot of sense even theoretically because of e.g. aliasing, but it doesn't matter because, as I said, C++ chooses to be slower. Titus gives a number of examples where we know how to do X fast, and that's how Rust does X. In theory C++ could do X the same way, but none of the three C++ compilers people actually use do it, because they picked wrong and then froze their ABI and won't thaw it.

No one writing anything that needs performance cares about some standard library ABI issues. Rust already has warts from bad API designs that constrain performance and they are unlikely to ever be fixed even with new editions. Rust will continue to pick up baggage as basically every language has done.

Aliasing has yet to provide any real benefit for Rust and a hell of a lot of issues. Maybe one day it will be a big win, but realistically anywhere that aliasing matters C/C++ will just drop __restrict on it.

> Rust already has warts from bad API designs that constrain performance and they are unlikely to ever be fixed even with new editions.

Like what?

> Aliasing has yet to provide any real benefit for Rust and a hell of a lot of issues.

Yeah, the performance wins so far are quite small. But rust's noalias-by-default did unearth a whole lot of latent bugs in LLVM. Even if you don't care about rust, it's great that rust led LLVM to track down and fix these bugs. They affected C/C++ code too.

> realistically anywhere that aliasing matters C/C++ will just drop __restrict on it.

Is there a way to tell? When I'm writing C, I have no idea if using restrict will help other than staring at the assembly. (Or just trying it). I'm also leery of using restrict in C because it's so hard to audit callers. How do you know when restrict is safe?

> When I'm writing C, I have no idea if using restrict will help other than staring at the assembly.

It's also a statement of intent: do you want callers to be able to pass the same object to different parameters? If that doesn't make sense or you don't want that, write restrict.
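
For comparison, a minimal Rust sketch of the same contract (the function name `add_twice` is invented for illustration). A `&mut` parameter can't alias any other live reference, so the compiler is allowed to assume roughly what `restrict` promises in C, and the borrow checker audits every caller for you.

```rust
/// `dst` is an exclusive reference, so it cannot point at the same object
/// as `src`. The compiler may therefore load `*src` once and reuse it.
/// `add_twice` is a hypothetical helper written for this sketch.
fn add_twice(dst: &mut i32, src: &i32) {
    *dst += *src;
    *dst += *src;
}

// A call like `add_twice(&mut a, &a)` is rejected at compile time
// (error E0502: cannot borrow `a` as immutable because it is also
// borrowed as mutable), so no caller can break the no-alias assumption
// from safe code.
```

In C the equivalent promise lives only in the `restrict` qualifier and the documentation, which is why auditing callers is the hard part.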

> ColdString and CompactString, which are way better than what's provided in C++

Could you elaborate on that?

Sure, to simplify let's assume a 64-bit CPU (this all works for 32-bit but that's less common these days and the actual numbers are different).

C++ std::string can contain up to 22 bytes (libc++ from Clang) or 15 bytes (the other popular implementations) of inline text, and the data structure itself is either 24 bytes (libc++ again) or 32 bytes of storage. Here's Raymond Chen: https://devblogs.microsoft.com/oldnewthing/20240510-00/?p=10...

CompactString is 24 bytes of storage with all 24 bytes as potential inline text. When the 24 bytes are valid UTF-8 text, that's the content of the CompactString, e.g. "https://example.org/cool". If they aren't, the last byte will be invalid UTF-8, and its value signals whether some of the other 23 bytes are inline UTF-8 (and if so how many) or whether they should be interpreted as a pointer, size and capacity.
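
To make the last-byte trick concrete, here is a toy sketch. It is not the compact_str crate's code: the tag values `LEN_TAG` and `HEAP_TAG` and the function names are assumptions picked for illustration, and only the inline case is actually built. The idea it relies on is that valid UTF-8 can only end in an ASCII byte or a continuation byte (0x00 to 0xBF), so any larger value in the last slot is free to carry metadata.

```rust
const CAP: usize = 24;
// Hypothetical tag values, chosen so they can never end valid UTF-8.
const LEN_TAG: u8 = 0xC0; // LEN_TAG + len marks an inline string of len < 24
const HEAP_TAG: u8 = 0xFF; // marks "the other bytes are pointer/size/capacity"

#[derive(Debug, PartialEq)]
enum Repr {
    Inline(usize), // number of text bytes stored inline
    Heap,
}

/// Decide how to interpret the buffer by looking only at its last byte.
fn classify_last_byte(buf: &[u8; CAP]) -> Repr {
    let last = buf[CAP - 1];
    if last < LEN_TAG {
        // ASCII or continuation byte: all 24 bytes are text.
        Repr::Inline(CAP)
    } else if last == HEAP_TAG {
        Repr::Heap
    } else {
        Repr::Inline((last - LEN_TAG) as usize)
    }
}

/// Pack a string of at most 24 bytes inline; longer strings return None
/// (a real type would allocate on the heap instead).
fn pack_inline(s: &str) -> Option<[u8; CAP]> {
    if s.len() > CAP {
        return None;
    }
    let mut buf = [0u8; CAP];
    buf[..s.len()].copy_from_slice(s.as_bytes());
    if s.len() < CAP {
        // Shorter strings leave the last byte free for the length tag.
        buf[CAP - 1] = LEN_TAG + s.len() as u8;
    }
    Some(buf)
}
```

With this scheme the 24-byte URL above is stored with no metadata byte at all, while "hi" is stored as two text bytes plus a tag in the last slot.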

ColdString is a radically different idea: it's 8 bytes of unaligned storage and it's one of three things: 1. 8 bytes of UTF-8 text (as before, we can tell by whether it's valid UTF-8 text), or 2. 0-7 bytes of UTF-8 text, prefixed by an invalid UTF-8 byte telling us how many of the remaining bytes are text, or 3. an encoded pointer to a length-prefixed data structure, signalled by the presence of the UTF-8 continuation marker bits, which should never be present in the first byte of a string.
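
The three cases can be told apart from the first byte alone. Below is a toy sketch of that dispatch, not the crate's actual encoding: the byte ranges used for the length prefix (0xF8 to 0xFF) and the names `Cold` and `classify_first_byte` are my assumptions for illustration.

```rust
#[derive(Debug, PartialEq)]
enum Cold {
    Inline8,            // all 8 bytes are UTF-8 text
    InlineShort(usize), // prefix byte, then this many text bytes
    HeapPointer,        // remaining bits encode a pointer
}

/// Classify an 8-byte word by its first byte.
fn classify_first_byte(word: [u8; 8]) -> Cold {
    match word[0] {
        // 10xxxxxx is a continuation byte; it can never start a string.
        0x80..=0xBF => Cold::HeapPointer,
        // 0xF8..=0xFF never appear in valid UTF-8; that is eight values,
        // enough to encode lengths 0 through 7 (assumed mapping).
        0xF8..=0xFF => Cold::InlineShort((word[0] - 0xF8) as usize),
        // Anything else is taken to be the start of 8 bytes of text.
        _ => Cold::Inline8,
    }
}
```

A real implementation also has to validate the text and decode the pointer; this only shows why the three cases never collide.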

I really like ColdString because it's so much in the "use the whole buffalo" spirit of these modern safe yet high performance types. UTF-8 has what are called "overlong prefixes" because it was invented before Unicode decided it would never grow beyond U+10_FFFF and these are often just a useless impediment, but ColdString uses those prefixes.

Thanks!

How do CompactString/ColdString compare to std::string implementations performance-wise? From the looks of it, they must be somewhat slower than C++ strings

I do not have hard numbers. However, keep in mind that practical "performance" also includes memory bandwidth and total RAM. This is especially a consideration for the ColdString type: a billion ColdStrings is 8GB of RAM, but a billion MSVC std::string needs 32GB of RAM. Rust's std::string::String is of course much faster than any of the std::string implementations because it never has the SSO case to consider, but for non-empty strings it also needs more memory bandwidth and RAM.
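
The RAM figures are just handle size times count, ignoring any heap allocations for strings too long to inline. A small sketch of that arithmetic (the helper names are made up here), plus a way to check the size of Rust's own `String` handle, which in practice is three machine words but is not a layout the language guarantees:

```rust
use std::mem::size_of;

/// Bytes taken by `count` string handles of `handle_size` bytes each.
/// Heap allocations for strings that don't fit inline are not counted.
fn handle_bytes(count: u64, handle_size: u64) -> u64 {
    count * handle_size
}

/// Size in bytes of Rust's `String` handle on the current target
/// (pointer, capacity and length in current implementations).
fn rust_string_handle_size() -> usize {
    size_of::<String>()
}
```

So `handle_bytes(1_000_000_000, 8)` is 8,000,000,000 bytes for the 8-byte handles and `handle_bytes(1_000_000_000, 32)` is 32,000,000,000 bytes for 32-byte ones, using decimal gigabytes.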