Hacker News new | ask | show | jobs
by Arnavion 710 days ago
I'm not sure what part of that is supposed to be a pain. The sans-io equivalent would be:

    handle_input(buf) -> Result {
        if len(buf) < 4 { return Error::IncompletePacket }

        len = toInt(buf[0..4])

        if len(buf) < 4 + len { return Error::IncompletePacket }

        packet = buf[4..(4 + len)]
        return Ok { packet: packet, consumed: 4 + len }
    }
where the semantics of `Error::IncompletePacket` are that the caller reads more into the buffer from its actual IO layer and then calls handle_input() again. So your "while not received required bytes: accumulate in buf" simply become "if len < required: return Error::IncompletePacket"
2 comments

I don't think that implementation is particularly good, although this is a big trick with Sans-IO: is the event loop responsible for buffering the input bytes? Or are the state machines?

In effect, you have to be thoughtful (and explicit!) about the event loop semantics demanded by each state machine and, as the event loop implementer, you have to satisfy all of those semantics faithfully.

A few alternatives include your version, one where `handle_input` returns something like `Result<Option<Packet>>` covering both error cases and successful partial consumption cases, one where `handle_input` tells the event loop how much additional input it knows it needs whenever it finishes parsing a length field and requires that the event loop not call it again until it can hand it exactly that many bytes.

This can all be pretty non-trivial. And then you'd want to compose state machines with different anticipated semantics. It's not obvious how to do this well.

Fair enough. So let's complicate it a little. If you have hierarchical variable sized structures within structures (e.g. java class file), then you need a stack of work in progress (pointers plus length) at every level. In fact, the moment you need a stack to simulate what would otherwise have been a series of function calls, it becomes a pain.

Or let's say you have a loop ("retry three times before giving up"), then you have to store the index in a recoverable struct. Put this inside a nested loop, and you know what I mean.

I have run into these situations enough that a flat state machine becomes a pain to deal with.

These are nicely solved using coroutines. That way you can have function related temporary state, IO-related state and stacks all taken care of simply.