Hacker News new | ask | show | jobs
by matklad 2243 days ago
Note that Rust and Go programs differ in a seemingly insignificant, but actually important detail.

Rust does this

    next_dist = std::cmp::min(
        dist_if_substitute,
        std::cmp::min(dist_if_insert, dist_if_delete),
    );
Go does this

    nextDist = min(
         distIfDelete, 
         min(distIfInsert, distIfSubstitute)
    )
The order of minimums is important for this dynamic programming loop. If I change Rust version to take minimums in the same order (swapping substitute and delete), runtime drops from 1.878696288 to 1.579639363.

I haven't investigated this, but I would guess that this is the same effect I've observed in

* https://matklad.github.io/2017/03/12/min-of-three.html

* https://matklad.github.io/2017/03/18/min-of-three-part-2.htm...

(reposting my comment from reddit, as it's a rather unexpected observation)

5 comments

Those are two really well written articles -- taking a complicated topic and making it very accessible. Thanks! FWIW I think a companion article on how to effectively use perf for tasks like this would be a great addition, since it can be a bit novice-unfriendly.
I think they made the rust version same as Go because I cloned it just now and they are both the same. Also, Thank you soo much for the blog posts! :)
Given this information, and for general parsing functionalities, which one is faster, Go or Rust?
As always, it depends on what your goals are. String processing is usually not the long pole when you're building something that consumes the output of a parser. Based on micro and macro benchmarks, Rust is typically substantially faster than Go and pretty much always uses less RAM [1].

But again, depends what you're doing with the output, and if these deltas even matter in your context.

[1] https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

You can write a parser as far as the limits of your imagination allow in Rust, or C, or any other baremetal language without a thick run-time.

In Go, beyond the limits of your imagination, you'll also hit other limits, like those of the garbage collector.

Very few things have ever been measured accurately to ten significant digits. How much did these numbers change run to run? How many measurements did you take? Were the caches warmed up similarly? Still, point taken.
The excessive "precision" is because I've just copy-pasted what the original bench printed.

As for "is this a reliable result", I believe I've performed diligence, appropriate for a HN comment, to make sure that this is not a completely random result. As I've said, I did not investigate this particular bit of code thoroughly. You are welcome to read the linked blog posts, which study the issue in depth.

Since min(a, min(b,c)) == min(b, min(a,c)), perhaps the compiler should be smart enough to swap the comparisons around if it makes it quicker?
I suspect that statement is not true for floats. Possibly you don’t get the same float from min(0,-0) as min(-0,0), and similarly with NaNs. Rust specifies that if one input is NaN then the other is returned but doesn’t say what happens if both are NaN.
> Rust specifies that if one input is NaN then the other is returned but doesn’t say what happens if both are NaN.

It does. If both are NaN a NaN is returned. Note, however, that when Rust says that a NaN is returned, this means that any NaN can be returned. So if you have min(NaN0, NaN0) the result isn't necessarily NaN0, it might be another NaN with a different payload.

Right. That’s what I was getting at but I didn’t know the NaN-return rule.
Hmm, not sure what the llvm semantics look like, but you're right about the assembly semantics, vminsd (the instruction used in the post) is unfortunately not symettric in it's handling on NaN.

https://www.felixcloutier.com/x86/minsd

NaN's are rarely required on the fast path of any code... The compiler ought to make a 2nd slow implementation for if any NaNs are seen.
Sure, but then it needs to add a branch to detect the inputs, and maybe keep around some state to switch between implementations at run-time.

Its far from clear that doing that is, in general, worth it.

Part of the problem may be they re-implemented `std::cmp::min` at the bottom of the file, I wonder if there's a more optimized version in the stdlib.
One of the most frustrating parts of rust (for me), is that `std::cmp::min` (and some other methods) require that their arguments are `Ord` (totally ordered), and floats are only partially order because of NaN, so you can't use std::cmp::min on floats.
I do believe that's quite intentional, due to the inexactness of floating-point values, it's not "right" to do that. There should be a `partial_min` function though, IMO, which has weaker guarantees.
It's absolutely intentional, and it's got being technically correct on its side, it's just frustrating.
Not really. That Rust forced total order and partial order into a sub-typing relation is a completely unnecessary, self-inflicted wound.
You can use the ordered_float crate. It has a newtype that cannot hold NaNs and thus impls Ord.
Depending on what behavior you want, there’s a bunch of wrapper crates.
Yeah, it makes no sense to be the default. Code that expects to treat NaNs is very rare.
Safety is the whole idea behind Rust, and if you draw the line here, that's neither in line with the Rust ethos, nor particularly valuable. After all, code that "expects to handle" null was pretty rare too ;)
NaNs have nothing to do with safety (same for nulls).

It is a common misconception to conflate safety with functional expectations. A program that only calls panic() is perfectly safe.