by diffxx 729 days ago
Not the OP, but for me one of the main lessons from Forth is that the boundary between high- and low-level code can be as thin as you make it. With Forth, you define high-level words in terms of low-level primitive words, then weave those words together at a high level. After you have written a correct solution at a high level, bottlenecks can be optimized out by introducing new low-level words. It's like you are solving a problem by designing an instruction set specifically for that problem. But unlike a hardware ISA, you can add new instructions tailor-made for the specific problem you are presently solving.
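
Here is a rough sketch of that layering, written in Python rather than Forth so it runs anywhere. Everything in it (make_vm, the word names, the way definitions are stored as strings) is made up for illustration and is not how any particular Forth works. It only shows a high-level word built from primitives, and then one word being swapped for a new primitive without touching its callers.

```python
# Hypothetical toy evaluator: not a real Forth, just enough to show the layering.

def make_vm():
    stack = []
    # Low-level primitives are host-language functions acting on the stack.
    prims = {
        "dup": lambda: stack.append(stack[-1]),
        "swap": lambda: stack.extend([stack.pop(), stack.pop()]),
        "*": lambda: stack.append(stack.pop() * stack.pop()),
        "+": lambda: stack.append(stack.pop() + stack.pop()),
    }
    # High-level words are just strings of other words.
    words = {}

    def run(source):
        for tok in source.split():
            if tok in prims:        # primitives take priority over definitions
                prims[tok]()
            elif tok in words:
                run(words[tok])
            else:
                stack.append(int(tok))  # anything else must be an integer literal
        return stack

    return stack, prims, words, run


stack, prims, words, run = make_vm()

# Step 1: a correct high-level solution built from primitives.
words["square"] = "dup *"
words["sum-of-squares"] = "square swap square +"
first = run("3 4 sum-of-squares")[-1]   # 25

# Step 2: pretend "square" is the bottleneck and add it as a new primitive.
# "sum-of-squares" is untouched and now picks up the faster word.
prims["square"] = lambda: stack.append(stack.pop() ** 2)
stack.clear()
second = run("3 4 sum-of-squares")[-1]  # still 25
```

The point is that the caller never changes; only the definition behind one word moves from the high-level layer to the low-level one.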

Paul Graham has described Lisp as a tool for writing fast programs fast (that's the gist at least). IIRC, he clarifies that each "fast" is a phase. Lisp allows you to quickly write a high-level prototype that might have poor runtime performance, and then you refine the prototype so that the compiler can generate fast machine code.

Forth is similar but feels closer to the machine than Lisp because of the stack-based threading model. In Forth, you often don't need manual memory management _or_ garbage collection, and you can easily extend the system with new low-level words. But it does require you to think differently about your program design so that it fits the stack-based VM model.

Historically, Forth has been implemented in assembly, but you can write a Forth that targets any host. The truly mind-expanding thing is that you, an individual, can write a Forth compiler, and you can also write it in such a way that it has multiple targets. For example, you can generate native, JVM and JS targets from the same underlying source. Usually this can be done by translating the Forth-generated AST to source code for another high-level language. This allows you to write a fast high-level language quickly without getting bogged down in writing, say, fast x86 codegen or a garbage collector. To get started, you just target the host language with the best high-level properties that you need (which will be problem and context specific). I don't enjoy writing C, but I have no problem translating to C if I need to for performance or for syscall access.
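
To make the multiple-targets idea concrete, here is a minimal sketch, assuming a toy subset (integer literals, +, *, dup) and skipping the AST entirely by translating token by token. The translate function and the template tables are hypothetical names of my own, not part of any real Forth. The same source is turned into Python text and into JavaScript text; only the Python output is actually executed here.

```python
# Hypothetical mini-translator: one toy source language, two text backends.

PY_TEMPLATES = {
    "+": "s.append(s.pop() + s.pop())",
    "*": "s.append(s.pop() * s.pop())",
    "dup": "s.append(s[-1])",
}
JS_TEMPLATES = {
    "+": "s.push(s.pop() + s.pop());",
    "*": "s.push(s.pop() * s.pop());",
    "dup": "s.push(s[s.length - 1]);",
}

# Per-target scaffolding: (word templates, literal push, header, footer).
TARGETS = {
    "python": (PY_TEMPLATES, "s.append({})",
               "def program():\n    s = []", "    return s"),
    "js": (JS_TEMPLATES, "s.push({});",
           "function program() {\n    const s = [];", "    return s;\n}"),
}


def translate(source, target):
    """Translate a whitespace-separated program into host source text."""
    templates, push, header, footer = TARGETS[target]
    body = []
    for tok in source.split():
        if tok in templates:
            line = templates[tok]
        else:
            line = push.format(int(tok))  # raises ValueError on unknown words
        body.append("    " + line)
    return "\n".join([header, *body, footer])


program = "3 dup * 4 +"
py_src = translate(program, "python")
js_src = translate(program, "js")

# Run the generated Python to check it; the JS text would go to a JS engine.
namespace = {}
exec(py_src, namespace)
result = namespace["program"]()  # [13]
```

Adding a third target in this sketch means adding one more entry to TARGETS, which is roughly why a single person can afford to support several hosts.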

To illustrate the practical power of this, suppose that you have an important subcomponent of your system that was written in, say, Node.js. It turns out that this is a critical bottleneck that cannot easily be optimized in JavaScript. You don't want to rewrite the whole system, but you do want to rewrite that component and have it seamlessly interoperate with the existing system. You could write a small Forth DSL for that subcomponent. This Forth will target JavaScript initially. You translate the subcomponent into Forth and reuse all of the existing tests (that were presumably also written in JavaScript). Then you rewrite the tests in Forth so that the entire subcomponent is now written in Forth. Now you write a new backend that translates to, say, C or Rust with Node bindings. You can run the native implementation against a native implementation of the test suite since they're both written in high-level Forth at this point. Then you can flip the switch between the native and JavaScript implementations and be confident that both implementations are identical because they have the same high-level description.
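
A scaled-down sketch of the "one description, two backends, one test suite" workflow, with loud assumptions: both backends below are Python (a direct interpreter and a source-emitting one standing in for the native target), the word set is a toy, and all the names are hypothetical. The part that carries over is that the tests are written once against the high-level program text and then run unchanged against whichever backend is switched on.

```python
# Hypothetical sketch: two interchangeable backends for one toy word set.

OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}


def run_interpreted(source):
    """Backend A: walk the tokens directly."""
    s = []
    for tok in source.split():
        if tok in OPS:
            b, a = s.pop(), s.pop()
            s.append(OPS[tok](a, b))
        elif tok == "dup":
            s.append(s[-1])
        else:
            s.append(int(tok))
    return s


def run_compiled(source):
    """Backend B: emit host source text, then execute it."""
    lines = ["s = []"]
    for tok in source.split():
        if tok in OPS:
            lines.append(f"b = s.pop(); a = s.pop(); s.append(a {tok} b)")
        elif tok == "dup":
            lines.append("s.append(s[-1])")
        else:
            lines.append(f"s.append({int(tok)})")
    env = {}
    exec("\n".join(lines), env)
    return env["s"]


# One test suite, written once against the high-level description.
TESTS = [("2 3 +", [5]), ("3 dup *", [9]), ("2 3 + 4 *", [20])]


def check(backend):
    """Return True if the backend passes every shared test."""
    return all(backend(src) == expected for src, expected in TESTS)


# "Flipping the switch" is just choosing a key here.
BACKENDS = {"interpreted": run_interpreted, "compiled": run_compiled}
results = {name: check(fn) for name, fn in BACKENDS.items()}
```

Passing the shared tests only shows the two backends agree on those cases, so how much confidence you get depends on how well the suite covers the subcomponent.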

Once you start seeing things this way though, you start realizing that you can write Forth style code in any language (and the reverse is also true). Forth is as much about a process for solving programming problems as it is a specific, concrete language. This is also why there is the old adage "once you've seen one Forth, you've seen one Forth."

2 comments

I didn't use FORTH long, maybe for a year, (back when 8080s were The Thing), but I remember my first reaction (after learning about Reverse Polish) to it: it was like using assembly language, but without the pain.

(Moved to 6502; don't recall having a version for that.)

I think a lot of recent Forths (at least outside of the embedded space) have been written in C or C++ rather than assembly now that processors are incredibly complicated. Of course, you can write a Forth on anything, including the JVM or in Python.
I think jonesforth is the most popular teaching implementation: https://github.com/nornagon/jonesforth/blob/master/jonesfort...

Factor might be a counterexample if one considers it a Forth: the VM is in part implemented in C++: https://github.com/factor/factor

I think it's portability and ease of development rather than CPU architecture complexity that makes someone pick C/C++ over assembly when implementing a Forth system. Because the Forth won't need much of the assembly language or obscure CPU instructions, the complexity of the architecture won't really matter to whoever is implementing it.

JonesForth is popular for teaching, but most of the Forth implementations people talk about on r/forth seem to be written in ASM only if the target is embedded. Portability, simplicity, and the complexity of modern processors just seem to make ASM less of a good option these days. I'm no Forth expert though, mainly a lurker.
I don't know what "r/forth" is. Some conference?

I'm absolutely not an expert, just a long time enjoyer of some classic Forth books and certain concatenative languages.

r/forth is a subreddit - so https://<reddit url>/r/forth
OK, so a web forum then. I avoid Reddit, I don't believe in corporations as stewards of web forums.