Hacker News
by fluffything 2272 days ago
That example does not work, because declaring `state` multiple times creates an illegal C++ program (redeclaration of local variable - notice that this is not the case in Rust).

You need to declare variables with different names:

    const auto state0 = newState();
    const auto state1 = next(state0);
    const auto state2 = next(state1);
    const auto state3 = next(state0); // TYPO (meant state2) -> BOOM use after move

I don't think this can be implemented safely in C++ without creating a "moved from" state that terminates the program on use, because C++ does not have affine or linear types.

That is, you can't use an `enum class`, since you can't implement move constructors and destructors for it, so you need to use a `variant` wrapper or similar:

    #include <utility>
    #include <variant>

    struct Color {
      struct Green {};
      struct Red {};
      struct Blue {};
      struct MovedFrom {};
      using data_t = std::variant<Green, Red, Blue, MovedFrom>;
      data_t data;

      // A "move" has to leave the source in an explicit MovedFrom state.
      Color(Color&& other) noexcept : data(std::move(other.data)) {
        other.data = MovedFrom{};
      }
      //... dozens more lines of boilerplate...
    };

and you probably can't use variant either, since using variant would introduce yet another possible state (e.g. if an exception gets thrown..).

So doing this right in C++ probably requires hundreds of lines of boilerplate, plus run-time state to keep track of moved-from values, so that states that have already been used cannot be used again.
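To make that run-time bookkeeping concrete, here is a minimal sketch of such a wrapper. `Once`, `take`, `live`, and this `next` signature are hypothetical names made up for illustration, and aborting on reuse is only one possible policy:

```cpp
#include <cassert>
#include <cstdlib>
#include <utility>

// Hypothetical wrapper: records at run time whether the state was consumed.
template <typename State>
class Once {
public:
    explicit Once(State s) : state_(std::move(s)) {}

    // Moving transfers liveness and marks the source as consumed.
    Once(Once&& other) noexcept
        : state_(std::move(other.state_)), live_(other.live_) {
        other.live_ = false;
    }
    Once(const Once&) = delete;
    Once& operator=(const Once&) = delete;
    Once& operator=(Once&&) = delete;

    bool live() const noexcept { return live_; }

    // Consumes the state; terminates the program if it was already consumed.
    State take() {
        if (!live_) std::abort();
        live_ = false;
        return std::move(state_);
    }

private:
    State state_;
    bool live_ = true;  // the run-time flag that Rust does not need
};

struct Green {};
struct Red {};

// Hypothetical transition Green -> Red that consumes its argument.
inline Once<Red> next(Once<Green>&& s) {
    s.take();  // aborts if s was already used
    return Once<Red>(Red{});
}
```

Calling `next(std::move(g))` a second time on the same `g` still compiles; it is only caught when `take()` aborts at run time, and every state value carries an extra `bool` to make that possible.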

At this point you might as well just write that part of your code in Rust, where `enum Color { Green, Red, Blue }` just works and will do what you want without any run-time state. If you need to do compile-time computations, you can either use nightly and use const generics, or you can use stable Rust and write a proc macro. Both options are easier for humans to get right than the amount of C++ boilerplate that's going to be required to work around the fact that move operations are not "destructive" / affine.

Another user below was arguing that they preferred to use C++ because there they don't need to use `{ }` to disambiguate const generics, yet they are apparently fine with using `var.template member_fn` to disambiguate all template method calls... I imagine many users will argue that writing all the boilerplate above is "fine" or "not a big deal". To me all this sounds like Stockholm syndrome: somebody must use C++, they have been using it for 10 years already, and having to write all this boilerplate and know all these nitpicky details just to write a trivial piece of code makes them feel clever and gives them job security. I'm not even going to read your comments, so really don't bother replying if that's what you are going to talk about.

2 comments

Nobody is using state machines to advance several times through the states with variables named "stateN", so I am not sure what is the point.

There is no "BOOM" either since "use after move" is not a safety concern for those empty types, just a logic bug, which will likely appear at compile-time since your template specialization would not match your expectations.

The redeclaration in Rust always makes me uneasy as a default. It would have been better to require special syntax.

The rest about C++ users looks like flamebait to me.

> Nobody is using state machines to advance several times through the states with variables named "stateN", so I am not sure what is the point.

The point is that in C++, every time you advance the state you "split" the state machine into two: one that can be used by mistake (which introduces a bug), and one that should be used.

In programming languages that properly support state machines (or session types, or any similar pattern), that split is guaranteed to be impossible, so you get the guarantee that users cannot misuse your API, because attempting to do so is a guaranteed compilation error.

> There is no "BOOM" either since "use after move" is not a safety concern for those empty types, just a logic bug, which will likely appear at compile-time since your template specialization would not match your expectations.

This isn't true: even if `state0` and `state1` have different types, as you are proposing, your proposed `next` function accepts both types according to your design without a compilation error.

There is "no" fix for this in C++. Even if you were to introduce `next0`, `next1`, etc., each accepting only one type (that of `state0`, `state1`, etc.), that would create a compilation error here:

    auto state0 = next(initial_state);
    auto state1 = next0(state0); 
    auto state2 = next0(state1); // ct-error: use next1(state1)

but the underlying error is still there: the user can write

    auto state2 = next0(state0); // use-after-move

That's a logic error that Rust catches at compile time, but that C++ would need to catch at run time. Catching it at run time adds overhead, since you now need to store the machine's current state in some run-time data structure to be able to verify these things (while in Rust, you don't have to track this at run time at all).
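A small sketch of both halves of that argument, assuming one type per state and one transition function per type. The `State0`..`State2` types, their `steps` field, and `reuse_compiles()` are hypothetical and exist only for this example:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// One distinct type per state (hypothetical, for illustration only).
struct State0 { int steps = 0; };
struct State1 { int steps = 0; };
struct State2 { int steps = 0; };

// One transition per state type, taking its argument by rvalue reference.
inline State1 next0(State0&& s) { return State1{s.steps + 1}; }
inline State2 next1(State1&& s) { return State2{s.steps + 1}; }

// Passing the wrong state type is rejected at compile time...
static_assert(!std::is_invocable_v<decltype(&next0), State1&&>);
static_assert(std::is_invocable_v<decltype(&next1), State1&&>);

// ...but reusing a state that was already advanced is accepted.
inline int reuse_compiles() {
    State0 s0{};
    State1 a = next0(std::move(s0));
    State1 b = next0(std::move(s0));  // stale state reused: no compile error
    return a.steps + b.steps;
}
```

The `static_assert`s cover the `next0(state1)` case, while the second `next0(std::move(s0))` compiles without complaint: the machine has been split into `a` and `b`, and the language does not object.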

> There is no "BOOM" either since "use after move" is not a safety concern for those empty types, just a logic bug

Rust allows you to assume that this logic bug never happens. C++ code that assumes this can easily have undefined behavior when the logic bug does happen. That is, C++ code cannot assume that the state machine will only go from one state to the next, at least not without the whole state machine library / implementation checking at every step that these bugs do not happen and, e.g., terminating the program if they do. That's a valid solution, and probably the best solution that can be implemented in C++, but compared to what Rust and other languages offer, it is a very bad solution and the consequences are quite drastic (state machines, session types, etc. are widely used in Rust to design APIs, while they aren't really used in C++ because they are very boilerplate-heavy, complex to implement, and incur a lot of run-time overhead to prevent these errors).

> The redeclaration in Rust always makes me uneasy as a default. It would have been better to require special syntax.

How many years have you been a full-time Rust user? Or how many of your C++ projects use the "state machine C++ pattern" that you are advocating here? How many developers are involved in each of those projects?

You claimed there is a safety issue on "use after free" for empty types which are trivial. I am still waiting to see the proof.

You also keep saying there is no way to fix in C++, you can most definitely make those into compilation errors. And that is if you insist that advancing several states and calling them "stateN" is useful, which I have never done in my life.

Then the last paragraph is more flamebait, plus an argument from authority.

> You claimed there is a safety issue on "use after free" for empty types which are trivial. I am still waiting to see the proof.

I claimed that it is trivial to accidentally introduce safety issues when implementing _state_ machines in C++ like you are proposing. The "state" in "state machines" comes from the machine actually having some state. Naive state machines don't store state, and simple state machines can encode all their state in types, but real world state machines rarely do so (e.g. a regex engine).

> You also keep saying there is no way to fix in C++, you can most definitely make those into compilation errors.

Show how to do that then, e.g., for a simple file handle wrapper that only allows reading a file once:

    struct FileHandle {
      static FileHandle open(const char*);
      ~FileHandle(); // closes file
      struct FileRead { ~FileRead(); /* closes file */ };
      static FileRead read(FileHandle);
    };

such that the file is not closed twice:

    auto file = FileHandle::open("foo");
    auto read = FileHandle::read(file);
    //~ read destructor closes file
    //~ file destructor double-closes file

without doing any run-time checks, e.g., in destructors, that check whether the file is closed.

This is easy peasy in Rust.
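For comparison, here is a sketch of a C++ version that does rely on exactly the run-time check ruled out above. The `open_` flag, the `close_count()` counter standing in for a real close call, and `closes_for_one_read()` are assumptions made for illustration:

```cpp
#include <cassert>
#include <utility>

// Stand-in for a real close call: counts how often a file was "closed".
inline int& close_count() { static int n = 0; return n; }

class FileHandle {
public:
    class FileRead {
    public:
        FileRead(FileRead&& o) noexcept : open_(std::exchange(o.open_, false)) {}
        FileRead(const FileRead&) = delete;
        ~FileRead() { if (open_) ++close_count(); }  // run-time check
    private:
        friend class FileHandle;
        FileRead() = default;
        bool open_ = true;
    };

    static FileHandle open(const char*) { return FileHandle{}; }

    FileHandle(FileHandle&& o) noexcept : open_(std::exchange(o.open_, false)) {}
    FileHandle(const FileHandle&) = delete;
    ~FileHandle() { if (open_) ++close_count(); }    // run-time check

    // Hands ownership of the file over to the returned FileRead.
    static FileRead read(FileHandle&& f) {
        f.open_ = false;  // the moved-from handle must remember it is dead
        return FileRead{};
    }

private:
    FileHandle() = default;
    bool open_ = true;
};

// Opens and reads once, then reports how many closes the destructors did.
inline int closes_for_one_read() {
    close_count() = 0;
    {
        auto file = FileHandle::open("foo");
        auto read = FileHandle::read(std::move(file));
    }  // both destructors run here
    return close_count();
}
```

The double close is avoided, but only because each object stores a flag and each destructor branches on it, and nothing stops a second `FileHandle::read(std::move(file))` from compiling.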

> and you probably can't use variant either, since using variant would introduce yet another possible state (e.g. if an exception gets thrown..).

Variants are excellent for a state machine. valueless_by_exception is pretty much irrelevant if your states' relevant constructors and assignments are nothrow.
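A minimal sketch of that claim, assuming three empty state types and a hypothetical cyclic `next` transition (the Green -> Red -> Blue order is made up for the example). The `static_assert`s check the nothrow properties under which assignment cannot leave the variant valueless:

```cpp
#include <cassert>
#include <type_traits>
#include <variant>

struct Green {};
struct Red {};
struct Blue {};
using Color = std::variant<Green, Red, Blue>;

// With nothrow moves, assignment cannot throw halfway through and leave
// the variant without a value.
static_assert(std::is_nothrow_move_constructible_v<Green>);
static_assert(std::is_nothrow_move_constructible_v<Red>);
static_assert(std::is_nothrow_move_constructible_v<Blue>);
static_assert(std::is_nothrow_move_assignable_v<Color>);

// Hypothetical transition: Green -> Red -> Blue -> Green.
inline Color next(Color c) {
    if (std::holds_alternative<Green>(c)) {
        c = Red{};
    } else if (std::holds_alternative<Red>(c)) {
        c = Blue{};
    } else {
        c = Green{};
    }
    return c;
}

// Runs one full cycle and checks that the variant always held a value.
inline bool cycle_stays_valid() {
    Color c = Green{};
    for (int i = 0; i < 3; ++i) {
        c = next(c);
        if (c.valueless_by_exception()) return false;
    }
    return std::holds_alternative<Green>(c);
}
```

This only addresses the extra valueless state; `next` takes its argument by copy here, so the old value remains usable afterwards.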

C++ variants are horrible for state machines, since they are not affine types.