Hacker News new | ask | show | jobs
by zodiac 1544 days ago
Using the interpreter example in the article, would it be accurate to say that the author's inefficient "each instruc­tion copies the existing bytes into a new array, change the byte of interest, return this new byte array, and abandon the old array (which would even­tu­ally be garbage-collected)" is "manipulating the state directly"?
2 comments

Yes, and also not a data structure that is suitable for the task. If you want to keep every version of the buffer as a value (which has nice properties like versioning) you would need a data structure that shares most of its structure with the previous versions. Like a persistent radix-n tree. Slower than mutable buffer for sure, but not pathological.
Got it. I'm familiar with persistent data structures but I was more curious as to the "Haskell approach" of doing the same problem (which, I assume, doesn't use persistent data structures per se).

Follow-up question: if you wanted to implement the interpreter (as in the author's article) in idiomatic Haskell, would using the State monad be one solution? (I assume that's what you meant by "compose functions that manipulate that state"). Then, if you translated that Haskell directly into Racket (something like: use the exact same functions, just delete the type annotations), what would be the run-time of the solution?

(Not a trick question, I really don't know the answers. I've "understood" how the State, Reader, Writer work in the sense of reading the code, checking the types, and confirming that they do what they claim, but didn't think about how a compiler would compile them before)

Translating naively, in Haskell I might use https://hackage.haskell.org/package/containers-0.6.5.1/docs/..., which does use sharing to reduce copying.
A simple Brankfuck interpreter in Haskell using state monads may be found at https://tromp.github.io/cl/Binary_lambda_calculus.html#Brain...
Yeah, for something where changes are local like that, using two stacks for your tape makes a lot of sense.
Not sure there's any benefit in using the state monad for an interpreter.

My first thought is to model a cycle of the interpreter as a function of old state into new state.

I guess the state monad could be used to thread the state dataflow through the interpreter, but it would not make the copy-big-buffer any more efficient. Code using the state monad is still pure and uses immutable data.

Now if you have to use a mutable buffer (only one fits in memory?) then you could use techniques like the IO monad to deal with it.

Disclaimer: Am more Clojure than Haskell person.

Writing efficient interpreters in a functional style is not easy. I write interpreters in OCaml for a living and we had to switch to a mutation-heavy style for performance and memory usage reasons. It can be done though.

When you think about it, any dependently typed language has to have an interpreter inside it, because you can put function applications inside a type and the typechecker will probably need to evaluate the application at some point. The go-to technique is to use Normalization by Evaluation [0], where you convert the AST version of a function in the interpreted language to an actual function in the host language, and piggyback off the host language's (presumably very fast) function application.

[0]: https://en.wikipedia.org/wiki/Normalisation_by_evaluation

I suspect you already know this, but other readers might be interested to know that in Haskell you can also use the ST monad which can do local (actual) mutation which is guaranteed to not escape the function's scope.

(Of course you can only access the actual local state and not do I/O or anything like that.)

Sure, but there are many kinds of interpreters!

I was involved with one that wasn't really demanding in terms of computation, but was distributed in a way that required a lot of speculation and rollbacks. The functional approach was pretty sweet.

Highest speed bytecode interpreter is actually one of the few loops where I prefer assembly to C, as the instruction mix, branch prediction and even the use of stack can be weird from the PoV of a modern optimizing compiler.

Isn't there a way to derive efficient/in-place interpretation from pure total recursive functions ?

reading lisp in small pieces, the author migrates interpreters from naive recursive eval to bytecode. It felt there was an underlying model to extract.

I am also in the compiler/interpreter business. Can you tell me more about what you are working on? It sounds interesting.
No, its not manipulating the state, which is why it is inefficient.

An example of what parent was talking about is instead of looking over a list 10 times calling different functions that build a new result list each time, you compose those inti one big function that visits each element once.