Hacker News new | ask | show | jobs
by ddulaney 436 days ago
Unfortunately, operator[] on std::vector is inherently unsafe. You can potentially try to ban it (using at() instead), but that has its own problems.

There’s a great talk by Louis Brandy called “Curiously Recurring C++ Bugs at Facebook” [0] that covers this really well, along with std::map’s operator[] and some more tricky bugs. An interesting question to ask if you try to watch that talk is: How does Rust design around those bugs, and what trade offs does it make?

[0]: https://m.youtube.com/watch?v=lkgszkPnV8g

1 comments

Thank you for sharing. Seems I still have more to learn!

It seems the bug you are flagging here is a null reference bug - I know Rust has Optional as a workaround for “null”

Are there any pitfalls in Rust when Optional does not return anything? Or does Optional close this bug altogether? I saw Optional pop up in Java to quiet down complaints on null pointer bugs but remained skeptical whether or not it was better to design around the fact that there could be the absence of “something” coming into existence when it should have been initialized

It's not so much Optional that deals with the bug. It's the fact that you can't just use a value that could possibly be null in a way that would break at runtime if it is null - the type system won't allow you, forcing an explicit check. Different languages do this in different ways - e.g. in C# and TypeScript you still have null, but references are designated as nullable or non-nullable - and an explicit comparison to null changes the type of the corresponding variable to indicate that it's not null.
I think sum types in general and Option<T> in particular is nicer. But the reason C# has nullability isn't that they disagree with me, it's that fundamentally the CLR has the same model as Java, all these types can be null, even though in the modern C# language you can say "No, not null that's never OK" at runtime on the CLR too bad maybe it's null.

For example if I write a C# function which takes a Goose, specifically a Goose, not a Goose? or similar - well, too bad the CLR says my C# function can be called by this obsolete BASIC code which has no idea what a Goose is, but it's OK because it passed null. If my code can't cope with a null? Too bad, runtime exception.

In real C# apps written by an in-house team this isn't an issue, Ollie may not be the world's best programmer but he's not going to figure out how to explicity call this API with a null, he's going to be stopped by the C# compiler diagnostic saying it needs a Goose, and worst case he says "Hey tialaramex, why do I need a Goose?". But if you make stuff that's used by people you've never met it can be an issue.

> For example if I write a C# function which takes a Goose, specifically a Goose, not a Goose? or similar - well, too bad the CLR says my C# function can be called by this obsolete BASIC code which has no idea what a Goose is, but it's OK because it passed null. If my code can't cope with a null? Too bad, runtime exception.

That's actually no different to Rust still; if you try, you can pass a 0 value to a function that only accepts a reference (i.e. a non-zero pointer), be it by unsafe, or by assembly, or whatever.

Disagreeing with another comment on this thread, this isn't a matter of judgement around "who's bug is it? Should the callee check for null, or the caller?". Rust's win is by clearly articulating that the API takes non-zero, so the caller is buggy.

As you mention it can still be an issue, but there should be no uncertainty around who's mistake it is.

The difference is that C# has well-defined behavior in this case - a non-nullable notification is really "not-nullable-ish", and there are cases even in the language itself where code without any casts in it will observe null values of such types. It's just a type system hole they allow for convenience and back-compat.

OTOH with Rust you'd have to violate its safety guarantees, which if I understand correctly triggers UB.

> which if I understand correctly triggers UB.

Yes, your parent's example would be UB, and require unsafe.

Rust’s Optional does close this altogether, yes. All (non-unsafe) users of Optional are required to have some defined behavior in both cases. This is enforced by the language in the match statement, and most of the “member functions” on Optional use match under the hood.

This is an issue with the C++ standardization process as much as with the language itself. AIUI when std::optional (and std::variant, which has similar issues) were defined, there was a push to get new syntax into the language itself that would’ve been similar to Rust’s match statement.

However, that never made it through the standardization process, so we ended up with “library variants” that are not safe in all circumstances.

Here’s one of the papers from that time, though there are many others arguing different sides: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p00...

> whether or not it was better to design around the fact that there could be the absence of “something” coming into existence when it should have been initialized

So this is actually why "no null, but optional types" is such a nice spot in the programming language design space. Because by default, you are making sure it "should have been initialized," that is, in Rust:

  struct Point {
      x: i32,
      y: i32,
  }
You know that x and y can never be null. You can't construct a Point without those numbers existing.

By contrast, here's a point where they could be:

  struct Point {
      x: Option<i32>,
      y: Option<i32>,
  }
You know by looking at the type if it's ever possible for something to be missing or not.

> Are there any pitfalls in Rust when Optional does not return anything?

So, Rust will require you to handle both cases. For example:

    let x: Option<i32> = Some(5); // adding the type for clarity

    dbg!(x + 7); // try to debug print the result
This will give you a compile-time error:

     error[E0369]: cannot add `{integer}` to `Option<i32>`
       --> src/main.rs:4:12
        |
    4   |     dbg!(x + 7); // try to debug print the result
        |          - ^ - {integer}
        |          |
        |          Option<i32>
        |
    note: the foreign item type `Option<i32>` doesn't implement `Add<{integer}>`
It's not so much "pitfalls" exactly, but you can choose to do the same thing you'd get in a language with null: you can choose not to handle that case:

    let x: Option<i32> = Some(5); // adding the type for clarity
    
    let result = match x {
        Some(num) => num + 7,
        None => panic!("we don't have a number"),
    };

    dbg!(result); // try to debug print the result
This will successfully print, but if we change `x` to `None`, we'll get a panic, and our current thread dies.

Because this pattern is useful, there's a method on Option called `unwrap()` that does this:

  let result = x.unwrap();
And so, you can argue that Rust doesn't truly force you to do something different here. It forces you to make an active choice, to handle it or not to handle it, and in what way. Another option, for example, is to return a default value. Here it is written out, and then with the convenience method:

    let result = match x {
        Some(num) => num + 7,
        None => 0,
    };

  let result = x.unwrap_or(0);
And you have other choices, too. These are just two examples.

--------------

But to go back to the type thing for a bit, knowing statically you don't have any nulls allows you to do what some dynamic language fans call "confident coding," that is, you don't always need to be checking if something is null: you already know it isn't! This makes code more clear, and more robust.

If you take this strategy to its logical end, you arrive at "parse, don't validate," which uses Haskell examples but applies here too: https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...