Hacker News
by smolder 605 days ago
I rewrote the same web API in JavaScript, Rust, C#, and Java as a "bench project" at work one time. The Rust version had the smallest memory footprint by far as well as the best performance. So, no, "any other language" [than JS] is not all the same.
4 comments

C# and Java are closer but not really on the level of Rust when it comes to performance. A better comparison would be with C++ or a similarly low-level language.

In my experience, languages like Ruby and Python are slower than languages like JavaScript, which are slower than languages like C#/Java, which are slower than languages like C++/Rust, which are slower than languages like C and Fortran. Assembly isn't always the fastest approach these days, but well-placed assembly can blow C out of the water too.

The ease of use and maintainability scale in reverse in my experience, though. I wouldn't want to maintain the equivalent of a quick and dirty RoR server reimplemented in C or assembly, especially after it's grown organically for a few years. Writing Rust can be very annoying when lifetimes or the borrow checker stop you from taking the normal programming shortcuts that JIT'ed languages allow.

Everything is a scale and faster does not necessarily mean better if the code becomes unreadable.

> A better comparison would be with C++ or a similarly low-level language.

Right, but then I'd have to write C++. Shallow dismissal aside (I really do not enjoy writing C++), the bigger issue is safety: I am almost certain to write several exploitable bugs in a language like C++ were I to use it to build an internet-facing web app. The likelihood of that happening with Rust, Java, C#, or any other memory-safe language is much lower. Sure, logic errors can result in security issues too, and no language can save you from those, but that's in part the point: when it comes to the possibility of logic errors, we're in "all things being equal" territory. When it comes to memory safety, we very much are not.

So that pretty much leaves me with Rust, if I've decided that the memory footprint or performance of Java or C# isn't sufficient for my needs. (Or something like Go, but I personally do not enjoy writing Go, so I wouldn't choose it.)

> Everything is a scale and faster does not necessarily mean better if the code becomes unreadable.

True, but unreadable-over-time has not been my experience with Rust. You can write some very plain-vanilla, not-"cleverly"-optimized code in Rust, and still have great performance characteristics. If I ever have to drop into 'unsafe' in a Rust code base for something like a web app, most likely I'm doing it wrong.

Rust provides better tools to handle logic errors as well. Sum types/exhaustive pattern matching, affine typing, and mutability xor aliasing let you model many kinds of real-world logical constraints within the type system. (And not just theoretically -- the teams and projects I work on use them every day to ship software with fewer bugs than ever.)
I'd even argue that idiomatic Rust is less prone to those "logic errors" than C++ and the language design gives you fewer chances to trip over yourself.
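
As a rough illustration of the sum-type and affine-typing point, here is a minimal Rust sketch. The `OrderState` enum, its variants, and the `describe` and `ship` functions are hypothetical names made up for this example; they do not come from any project mentioned in the thread.

```rust
/// Hypothetical order lifecycle modeled as a sum type (illustrative only).
#[derive(Debug, Clone, PartialEq)]
pub enum OrderState {
    Pending,
    Paid { amount_cents: u64 },
    Shipped { tracking_id: String },
    Cancelled,
}

/// Exhaustive match: adding a new variant to `OrderState` makes this
/// function fail to compile until the new case is handled.
pub fn describe(state: &OrderState) -> String {
    match state {
        OrderState::Pending => "awaiting payment".to_string(),
        OrderState::Paid { amount_cents } => format!("paid {} cents", amount_cents),
        OrderState::Shipped { tracking_id } => format!("shipped, tracking {}", tracking_id),
        OrderState::Cancelled => "cancelled".to_string(),
    }
}

/// Takes the old state by value (a move), so the caller cannot keep using it
/// after the transition. Only a paid order may ship; any other state is
/// handed back unchanged as the error value.
pub fn ship(state: OrderState, tracking_id: String) -> Result<OrderState, OrderState> {
    match state {
        OrderState::Paid { .. } => Ok(OrderState::Shipped { tracking_id }),
        other => Err(other),
    }
}
```

Because `ship` consumes its argument, code that tries to read the old state after a successful transition does not compile, which is one way a business rule can be pushed into the type system.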

Even with the basics: nobody calls Rust's [T]::sort_unstable without knowing it is an unstable sort. Even if you've no idea what "stability" means in this context, you are cued to go find out. But in C++ that is just called "sort". Hope you don't mind that it's unstable...
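
For readers who have not met the distinction, a small sketch of what the two names promise. The helper functions and the `(key, label)` data are invented for this example; `sort_by_key` and `sort_unstable_by_key` are the standard slice methods.

```rust
/// Stable sort by the numeric key: items with equal keys keep their
/// original relative order.
pub fn sort_stable_by_key(mut items: Vec<(u32, &'static str)>) -> Vec<(u32, &'static str)> {
    items.sort_by_key(|item| item.0);
    items
}

/// Unstable sort by the numeric key: the keys end up ordered, but items
/// with equal keys may come out in any relative order.
pub fn sort_unstable_by_key(mut items: Vec<(u32, &'static str)>) -> Vec<(u32, &'static str)> {
    items.sort_unstable_by_key(|item| item.0);
    items
}
```

With input `[(2, "a"), (1, "b"), (2, "c"), (1, "d")]`, the stable version is guaranteed to keep "b" before "d" and "a" before "c". The unstable version only guarantees the key order 1, 1, 2, 2.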

[Edited because I can't remember the correct order of words apparently.]

> when it comes to the possibility of logic errors, we're in "all things being equal" territory. When it comes to memory safety, we very much are not.

Very well summed. I'll remember this exact quote. Thank you.

My goal with the project was to compare higher-performance memory-safe languages to JavaScript in terms of memory footprint, throughput, latency, and difficulty of implementation. Rust was, relatively speaking, slightly more difficult, because concurrently manipulated data needed to be explicitly wrapped in a mutex, and transforming arbitrary JSON structures (which was what one of the endpoints did) was slightly more complex than in the others. But overall, even the endpoints that I thought might be tricky in Rust weren't really what I'd call difficult to implement, and the code wasn't difficult to read either. It seemed worth the trade-off to me, and I regret not having had more opportunities to work with it professionally in the time since.
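
The "explicitly wrapped in a mutex" part might look roughly like the following. This is a sketch using only the standard library, not the bench project's code; the `count_hits` function, the `/items` key, and the use of plain threads instead of a web framework are all assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `workers` threads that each record `hits_per_worker` hits for one
/// hypothetical endpoint, then returns the total that was recorded.
pub fn count_hits(workers: usize, hits_per_worker: usize) -> u64 {
    // Shared mutable state must be wrapped: Arc for shared ownership across
    // threads, Mutex for exclusive access while mutating.
    let hits: Arc<Mutex<HashMap<String, u64>>> = Arc::new(Mutex::new(HashMap::new()));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let hits = Arc::clone(&hits);
            thread::spawn(move || {
                for _ in 0..hits_per_worker {
                    // The lock is released when `guard` goes out of scope.
                    let mut guard = hits.lock().unwrap();
                    *guard.entry("/items".to_string()).or_insert(0) += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = hits.lock().unwrap().get("/items").copied().unwrap_or(0);
    total
}
```

Dropping the `Mutex` and mutating the map directly from several threads is rejected at compile time, which is the extra friction (and the safety net) being described.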
C and Fortran are not faster than C++, and haven't been for a long time. I've used all three languages in high-performance contexts. In practice, C++ currently produces the fastest code of high-level languages.
Because C++ doesn't restrict aliasing there are a bunch of cases where it's just unavoidably worse. The compiler is obliged to assume that if there are potentially aliasing objects of type T: T1 and T2 then mutating T1 might also mutate T2 (because it may be an alias), so therefore we must re-fetch T2.
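
A small Rust sketch of the aliasing argument, for contrast. `add_twice` is a made-up function; whether a given compiler version actually exploits the guarantee is not something this example demonstrates.

```rust
/// `dst` is an exclusive reference and `src` is a shared one, so safe Rust
/// guarantees they never refer to the same integer. The compiler is therefore
/// allowed to read `*src` once and reuse the value. The C++ analogue taking
/// `int&` and `const int&` has to allow for both naming the same object,
/// so `src` may need to be re-read after the first write.
pub fn add_twice(dst: &mut i32, src: &i32) {
    *dst += *src;
    *dst += *src;
}
```

Calling `add_twice(&mut x, &x)` does not compile, because `x` cannot be borrowed mutably and immutably at the same time.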
That is more theory than reality in high-performance code, and was noted as such even back when I was in HPC. The compiler isn’t stupid and normal idiomatic high-performance code in C++ has codegen that is essentially indistinguishable from the FORTRAN in virtually all cases. It has been a couple decades and many compiler versions since anyone had to worry about this. One of the things that killed the use of FORTRAN in HPC is that it empirically did not produce code that was any faster than C++ in practice and was much more difficult to maintain. Advantage lost.

The extensive compile-time metaprogramming facilities in C++ give it unique performance advantages relative to other performance languages, and are the reason it tends to be faster in practice.

Generally, the reason C++ is so stupidly fast compared to even C is because a lot is pushed to compile-time via templates. You can avoid passing pointers, doing indirection, and you can even inline functions altogether. Flattening objects and methods to encode as much information as you can in the type at compile-time will almost always be much faster than doing dynamic redirection at runtime.

For example, compare the speed and implementation of std::sort and qsort (it's almost an order of magnitude difference in run time for big N!)
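
The same contrast can be sketched in Rust, whose generics are also monomorphized. This is an analogue, not the C++ or C library code: the insertion sort, the function names, and the `less_i32` helper are all invented for illustration, and no timing is claimed.

```rust
/// std::sort-style: `F` is a type parameter, so a separate copy of this
/// function is compiled for each comparator and the call can be inlined.
pub fn insertion_sort_generic<T, F>(items: &mut [T], less: F)
where
    F: Fn(&T, &T) -> bool,
{
    for i in 1..items.len() {
        let mut j = i;
        while j > 0 && less(&items[j], &items[j - 1]) {
            items.swap(j, j - 1);
            j -= 1;
        }
    }
}

/// qsort-style: the comparator is a plain function pointer, so one compiled
/// body serves every comparator and each comparison is an indirect call
/// unless the optimizer manages to see through the pointer.
pub fn insertion_sort_fn_ptr<T>(items: &mut [T], less: fn(&T, &T) -> bool) {
    for i in 1..items.len() {
        let mut j = i;
        while j > 0 && less(&items[j], &items[j - 1]) {
            items.swap(j, j - 1);
            j -= 1;
        }
    }
}

/// Hypothetical comparator used with the function-pointer version.
pub fn less_i32(a: &i32, b: &i32) -> bool {
    a < b
}
```

Both versions produce the same result; the difference is in how much the compiler knows about `less` when it generates the inner loop.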

Sure, but note that unlike the aliasing overhead the C programmer can just specialise by hand to get the same results.

Also, sorting is something where algorithmic improvement makes a sizeable difference so you need to be sure you're either measuring apples vs apples or that you've decided up front what your criteria are (e.g. lazy people will use the stdlib so only test that; or nobody sorts non-integer types so I only test those)

For some inputs if you're willing to use a specialist sort the best option today is C. If you care enough to spend resources on specialising the sort for your purpose that's a real option. Or alternatively if you can't be bothered to do more than reach for the standard library of course Rust has significantly faster sort (stable and unstable) than any of the three C++ stdlibs. Or maybe you want a specialized vector sort that Intel came up with and they wrote it for C++. Hope portability wasn't an issue 'cos unsurprisingly Intel only care if it works on Intel CPUs.

> can just specialise by hand to get the same results

Sure, if you write all the code. If you're writing a library or more generic functions, you don't have that power.

And, even then, while you can do this it's going to be much more code and more prone to bugs. C++ is complex, but that complexity can often bring simplicity. I don't need to specialize for int, double, float, etc because the compiler can do it for me. And I know the implementation will be correct. If I specialize by hand, I can make mistakes.

In addition, this isn't something where C "shines". You can do the exact same thing in C++, if you want. Many templates have hand-rolled specializations for some types.

> apples vs apples

It is, they're both qsort. When every single comparison requires multiple dereferences + a function call it adds up.

> For some inputs if you're willing to use a specialist sort the best option today is C

I don't understand how. Even if this is the case, which I doubt, you could just include the C headers in a C++ application. So, C++ is an equally good choice + you get whatever else you want/need.

> Rust has significantly faster sort (stable and unstable) than any of the three C++ stdlibs

Maybe, but there's a new std::sort implementation in LLVM 17. Regardless, the Rust implementations are very fast for the same reason the C++ implementations are fast - encoding information in types at compile-time and aggressively inlining the comparison function. Rust has a very similar generic methodology to C++.

I am skeptical about this. The optimizer can also specialize functions, and programmers can too. The excessive specialization you get with templates always looks beautiful in microbenchmarks but may not be ideal on a larger scale. There was a recent report analyzing the performance of Rust drivers vs C drivers, and code bloat caused by monomorphization was an issue with the Rust ones; in my experience (although I do not have a reference) it is the same in C++.
> The optimizer can also specialize functions, and programmers can too

Yes, but not if you pass in void *. For libraries this matters. If you're both writing the producer and consumer then sure, you can do it manually.

> code bloat caused by monomorphization

This is true and a real problem, but I would argue in most scenarios extra codegen will be more performant than dynamic allocation + redirection. Because that's the alternative, like how swift or C# or Java do it.
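
The two alternatives being weighed can be shown side by side in Rust. The `Shape` trait and the `Square` and `Rect` types are hypothetical; the sketch shows the shape of the trade-off and makes no measurement.

```rust
pub trait Shape {
    fn area(&self) -> f64;
}

pub struct Square(pub f64);
pub struct Rect(pub f64, pub f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

/// Monomorphized: one copy of this function is generated per concrete `S`
/// (more code), but each `area` call is resolved at compile time.
pub fn total_area_static<S: Shape>(shapes: &[S]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

/// Dynamic: a single compiled body, but every element is a separate heap
/// allocation and every `area` call goes through a vtable.
pub fn total_area_dynamic(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}
```

The static version only accepts a slice of one concrete type, while the boxed version can mix `Square` and `Rect` in the same collection. That flexibility is what the extra allocation and indirection buy.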

I have written and worked on more than my fair share of Rust web servers, and the code is more than readable. This typically isn't the kind of Rust where you're managing lifetimes and type annotations so heavily.
> A better comparison would be with C++ or a similarly low-level language.

You probably want the apples-to-apples comparison, but this looks like an artificially limiting comparison; people are shilling, ahem, sorry, advocating for their languages in most areas, especially web / API servers. If somebody is making grandiose claims about their pet language then it's very fair to slap them with C++ or Rust or anything else that's actually mega ultra fast.

So there's no "better" comparison here. It's fair game to compare everything to everything if people use all languages for the same kinds of tasks. And they do.

C# and Java are languages with very different performance ceilings and techniques available for memory management.
They are not saying every language will have the same level of improvement as Rust; they are saying most of the improvement is available in most languages.

Perhaps you get from 1300 MB to 20 MB with C#, Java, or Go, and to 13 MB with Rust. The point is that Rust’s design is not the reason for the bulk of the reduction.

Sure, but until people actually have real data that’s just supposition. If a Java rewrite went from 1300MB to, say, 500MB they’d have a valid point and optimizing for RAM consumption is severely contrary to mainstream Java culture.
I’m curious how Go stacks up against C# and Java these days.

“Less languages features, but a better compiler” was originally the aspirational selling point of Go.

And even though there were some hiccups, at least 10 years ago, I remember that mainly being true for typical web servers. Go programs did tend to use less memory, have fewer GC pauses (in the context of a normal api web server), and start up faster.

But I know Java has put a ton of work in to catch up to Go. So I wonder if that’s still true today?

The Go compiler is by far the weakest among those three. GC pause time is a little lie that leaves the allocation throttling, pause frequency and write barrier cost out of the picture. Go works quite well within its intended happy path but regresses massively under heavier allocation traffic in a way that just doesn’t happen in .NET or OpenJDK GC implementations.
You also have to think about your target audience.

Are you hiring developers that are 100% fully conscious of concurrency and starvation or people that are only concerned with rest and vest and TC?

For either case Go is better.

* For people that are aware of concurrency, they will select Go because they appreciate its out-of-the-box preemptive concurrency model with work stealing.

* For people that are not aware of concurrency, then you should definitely use Go because they are not qualified to safely use anything else.

That’s why I specifically qualified my comment “within the context of a typical crud api server”.

I remember this being true 10 years ago. Java web servers I maintained had a huge problem with tail latency. Maybe if you were working on a 1 qps service it didn’t matter. But for those of us working on high qps systems, this was a huge problem.

But like I said, I know the Java people have put a ton of work in to try to close the gap with Go. So maybe this isn’t true anymore.

You can't compare Java from 10 years ago to current Go. Ten years ago was Java 8; we are currently on Java 23. The performance difference between these two runtimes is massive, especially between the available garbage collectors.

Hazelcast has a good blog [0] on their benchmarks between 8 and some of the more modern runtimes, here is one of their conclusions:

> JDK 8 is an antiquated runtime. The default Parallel collector enters huge Full GC pauses and the G1, although having less frequent Full GCs, is stuck in an old version that uses just one thread to perform it, resulting in even longer pauses. Even on a moderate heap of 12 GB, the pauses were exceeding 20 seconds for Parallel and a full minute for G1. The ConcurrentMarkSweep collector is strictly worse than G1 in all scenarios, and its failure mode are multi-minute Full GC pause

[0] https://hazelcast.com/blog/performance-of-modern-java-on-dat...

Typical CRUD API server is going to do quite a few allocations, maybe use the "default" (underwhelming) gRPC implementation to call third-parties and query a DB (not to mention way worse state of ORMs in Go). It's an old topic.

Go tends to perform better at "leaner" microservices, but if you are judging this only by comparing it to the state of Java many years ago, ignoring numerous alternative stacks, it's going to be a completely unproductive way to look at the situation. Let's not move the goalposts.

Depends if you're measuring .NET as written by members of the core team, with all the tricks and hacks, or .NET as written by everyone else.
> “Less languages features, but a better compiler” was originally the aspirational selling point of Go.

A faster compiler was the aspirational selling point. As legend has it, Go was conceived while waiting for a C++ program to compile.

Before what was called "Go 2" transitioned the project away from Google and into community direction there was some talk of adding no more features, instead focusing on improving the compiler... But since the community transition took place, the community has shown that they'd rather have new features.

The "Go 1" project is no longer with us (at least publicly; perhaps it lives on inside Google?)

One of the big draws of Go is ease of deployment. A single self-contained binary is easy to package and ship, especially with containers.

I don’t think Java has any edge when it comes to deployment.

Java AOT has come a long way, and is not as rare as it used to be. Native binaries built with GraalVM AOT are becoming a more common way to ship CLI tools written in JVM languages.
Native image continues to be relegated to a “niche” scenario with very few accommodations from the wider Java ecosystem.

This contrasts significantly with the effort behind and adoption of NativeAOT in .NET. Besides CLI tools, though, the scenarios where it shines, such as GUI applications, aren’t ones that Go is capable of addressing properly in the first place.

It's hard to compare Rust or C++ to GC langs like C# and Java because their runtimes are greedy. The CLR will easily take 10x more memory than it's currently using such that future allocations are much, much faster. So measuring the memory consumption of a JVM/CLR application is not simple. You need to ask the GC how much memory you're actually using - you can't just check the task monitor.

You can also do the same thing in Rust or C++. It is very common in C++ and speeds up programs quite a bit.
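
One modest, assumed reading of "the same thing" in Rust is reserving memory up front so that later allocations are cheap. The sketch below uses the standard `Vec::with_capacity`; the `collect_squares` function is invented for the example, and real programs may use pools or arenas instead.

```rust
/// Reserves room for `n` elements before filling the vector, so none of the
/// `push` calls below needs to grow and reallocate the buffer. The process
/// holds the memory from the start, whether or not it is all used yet.
pub fn collect_squares(n: usize) -> Vec<u64> {
    let mut out: Vec<u64> = Vec::with_capacity(n);
    for i in 0..n as u64 {
        out.push(i * i);
    }
    out
}
```

As with a greedy GC heap, the reserved capacity shows up as memory taken by the process even while only part of it holds live data.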

> The CLR will easily take 10x more memory

CoreCLR itself doesn't take much memory - GC might decide on a large heap size however. Do give .NET 9 a try with Server GC which has enabled DATAS by default. It prioritizes smaller memory footprint much more heavily and uses a much more advanced tuning algorithm to balance out memory consumption, allocation throughput and % of time spent in GC.