| > I feel like we don’t need a novel language for this. Well, there's like 50 others, many of whom are in heavy use in industry, so, I guess I feel like a lot of people think this is useful. . > I feel like Erlang is already a great language — almost a DSL — for “making complex finite-state machines easy to create.” Erlang is almost my favorite language. I've tried using both of its finite state machine libraries. I do not find them to be usable. The results are extremely verbose, and if one goes over around 30 states, I cease to be able to debug them. The natural implementation of the TCP/IP state machine in FSL is eight lines; 315 bytes. The TCP/IP Erlang state machine example floating around is like ten times that size. I find debugging that thing exhausting. . > It’s very easy to write a static analysis pass to verify that a primitive Erlang module like this constitutes an FSM “and no more” It's not clear to me why you would want this . > Honestly, ignoring all the stuff about concurrency, Erlang source code / BEAM bytecode is an almost perfect abstract machine for reifying various computational formalisms It can't implement anything without Okasaki juggling. You and I are in strong disagreement here. FSMs are a great example. You either need to reach outside the Beam VM with `hipe_bifs:array` or you need to copy the entire machine state every time you want to mutate it. . > And, if I were implementing something like Cloudflare’s Edge Workers “but for FSMs”, it’d be an extremely easy decision to standardize on (a restricted, static-analyzed at submit-time sub-ISA of) BEAM bytecode. At Cloudflare's scale, the inability to implement mutably would become a serious performance problem. |
I'm not talking about gen_fsm or gen_statem — note how I said "without OTP" above. I'm talking about writing Erlang the way it was originally conceived before proc_lib existed — where each process has a module that's exclusively responsible for its own receive loop, rather than being a delegate module for a generic receive-loop manager framework (proc_lib).
That kind of Erlang code produces simple, easy-to-analyze bytecode. It doesn't integrate well with supervisors or whatever, but interoperation with the rest of the Erlang/OTP ecosystem isn't the point. You don't call into arbitrary slap-dash non-formalized libraries within code that's supposed to represent a mathematical formalism. (E.g. you wouldn't call an arbitrary library from within a PEG grammar DSL. This is why e.g. Yecc's Erlang callbacks happen after parsing, rather than instead of parsing. If they happened instead of parsing, you'd break the formalism of the parser, and wouldn't be able to trust that your grammar does in practice what it was proven to do in theory any more.)
Instead, you create your own little world inside the formalism, that maybe imports a few reusable pre-proven abstractions compatible with the formalism, or that opaquely maps into pre-proven abstractions that exist within the prover (e.g. Z3's bit-vectors.) Abstractions are available in most-any other interpreter / prover / generator / whatever-er for the same formalism.
> FSMs are a great example. You either need to reach outside the Beam VM with `hipe_bifs:array` or you need to copy the entire machine state every time you want to mutate it.
Why are you mutating something outside the FSM from within the FSM? If the only state isn't "the state" (i.e. what box you're in on the FSM diagram), then what you have is not an FSM (in the computability-theory sense) any more.
The point of an FSM is to reduce the "power" of reasoning needed to prove things about the abstract machine — and so to be able to make guarantees that Turing machines cannot make, like being able to prove termination — by constraining what side-effects an operation within the abstract machine can have. An FSM can only switch its state among one of a usefully-enumerable number of options, e.g. the set of functions in a module, or the possible values of a single machine-register. A pushdown automaton can only switch its state among the sets of all possible sequences of such usefully-enumerable options (i.e. stacks of FSM states.) Etc.
(Note that Erlang/OTP's gen_fsm and gen_statem aren't FSMs in the computability-theory sense, since they pass along a state variable. They can be used as such if you don't put anything in that variable — or even if you put a scalar in that variable and constrain its size — but even if you don't use it, the resulting bytecode still passes that variable along in a way that makes it harder to translate the bytecode into prover lemmas. But all the remote-call stuff to proc_lib code that's not part of the proof already makes that nigh-on impossible, so....)
I think we might be talking about very different things here but both calling them "FSMs." I'm talking about how Erlang's native syntax is really good at expressing https://en.wikipedia.org/wiki/Deterministic_finite_automaton (like non-backtracking regexps) succinctly. What are you talking about?