| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by matklad 2243 days ago

Note that Rust and Go programs differ in a seemingly insignificant, but actually important detail.

Rust does this

    next_dist = std::cmp::min(
        dist_if_substitute,
        std::cmp::min(dist_if_insert, dist_if_delete),
    );

Go does this

    nextDist = min(
         distIfDelete, 
         min(distIfInsert, distIfSubstitute)
    )

The order of minimums is important for this dynamic programming loop. If I change Rust version to take minimums in the same order (swapping substitute and delete), runtime drops from 1.878696288 to 1.579639363.

I haven't investigated this, but I would guess that this is the same effect I've observed in

* https://matklad.github.io/2017/03/12/min-of-three.html

* https://matklad.github.io/2017/03/18/min-of-three-part-2.htm...

(reposting my comment from reddit, as it's a rather unexpected observation)

5 comments

ttd 2243 days ago

Those are two really well written articles -- taking a complicated topic and making it very accessible. Thanks! FWIW I think a companion article on how to effectively use perf for tasks like this would be a great addition, since it can be a bit novice-unfriendly.

ishanjain28 2243 days ago

I think they made the rust version same as Go because I cloned it just now and they are both the same. Also, Thank you soo much for the blog posts! :)

_____smurf_____ 2243 days ago

Given this information, and for general parsing functionalities, which one is faster, Go or Rust?

arcticbull 2243 days ago

As always, it depends on what your goals are. String processing is usually not the long pole when you're building something that consumes the output of a parser. Based on micro and macro benchmarks, Rust is typically substantially faster than Go and pretty much always uses less RAM [1].

But again, depends what you're doing with the output, and if these deltas even matter in your context.

[1] https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

fluffything 2243 days ago

You can write a parser as far as the limits of your imagination allow in Rust, or C, or any other baremetal language without a thick run-time.

In Go, beyond the limits of your imagination, you'll also hit other limits, like those of the garbage collector.

drfuchs 2243 days ago

Very few things have ever been measured accurately to ten significant digits. How much did these numbers change run to run? How many measurements did you take? Were the caches warmed up similarly? Still, point taken.

matklad 2243 days ago

The excessive "precision" is because I've just copy-pasted what the original bench printed.

As for "is this a reliable result", I believe I've performed diligence, appropriate for a HN comment, to make sure that this is not a completely random result. As I've said, I did not investigate this particular bit of code thoroughly. You are welcome to read the linked blog posts, which study the issue in depth.

londons_explore 2243 days ago

Since min(a, min(b,c)) == min(b, min(a,c)), perhaps the compiler should be smart enough to swap the comparisons around if it makes it quicker?

dan-robertson 2243 days ago

I suspect that statement is not true for floats. Possibly you don’t get the same float from min(0,-0) as min(-0,0), and similarly with NaNs. Rust specifies that if one input is NaN then the other is returned but doesn’t say what happens if both are NaN.

fluffything 2243 days ago

> Rust specifies that if one input is NaN then the other is returned but doesn’t say what happens if both are NaN.

It does. If both are NaN a NaN is returned. Note, however, that when Rust says that a NaN is returned, this means that any NaN can be returned. So if you have min(NaN0, NaN0) the result isn't necessarily NaN0, it might be another NaN with a different payload.

dan-robertson 2243 days ago

Right. That’s what I was getting at but I didn’t know the NaN-return rule.

gpm 2243 days ago

Hmm, not sure what the llvm semantics look like, but you're right about the assembly semantics, vminsd (the instruction used in the post) is unfortunately not symettric in it's handling on NaN.

https://www.felixcloutier.com/x86/minsd

londons_explore 2243 days ago

NaN's are rarely required on the fast path of any code... The compiler ought to make a 2nd slow implementation for if any NaNs are seen.

fluffything 2243 days ago

Sure, but then it needs to add a branch to detect the inputs, and maybe keep around some state to switch between implementations at run-time.

Its far from clear that doing that is, in general, worth it.

arcticbull 2243 days ago

Part of the problem may be they re-implemented `std::cmp::min` at the bottom of the file, I wonder if there's a more optimized version in the stdlib.

gpm 2243 days ago

One of the most frustrating parts of rust (for me), is that `std::cmp::min` (and some other methods) require that their arguments are `Ord` (totally ordered), and floats are only partially order because of NaN, so you can't use std::cmp::min on floats.

shepmaster 2243 days ago

That’s what https://doc.rust-lang.org/std/primitive.f32.html#method.min is for

arcticbull 2243 days ago

I do believe that's quite intentional, due to the inexactness of floating-point values, it's not "right" to do that. There should be a `partial_min` function though, IMO, which has weaker guarantees.

gpm 2243 days ago

It's absolutely intentional, and it's got being technically correct on its side, it's just frustrating.

oxnrtr 2242 days ago

Not really. That Rust forced total order and partial order into a sub-typing relation is a completely unnecessary, self-inflicted wound.

Arnavion 2243 days ago

You can use the ordered_float crate. It has a newtype that cannot hold NaNs and thus impls Ord.

steveklabnik 2243 days ago

Depending on what behavior you want, there’s a bunch of wrapper crates.

jfkebwjsbx 2243 days ago

Yeah, it makes no sense to be the default. Code that expects to treat NaNs is very rare.

arcticbull 2243 days ago

Safety is the whole idea behind Rust, and if you draw the line here, that's neither in line with the Rust ethos, nor particularly valuable. After all, code that "expects to handle" null was pretty rare too ;)

jfkebwjsbx 2243 days ago

NaNs have nothing to do with safety (same for nulls).

It is a common misconception to conflate safety with functional expectations. A program that only calls panic() is perfectly safe.