by Const-me 2138 days ago
I've never written a virtual machine, because I know why Sun (now Oracle) and Microsoft each spent a billion to create theirs. You can write something that works over a weekend, but the performance won’t be good without JIT, generational GC, and many other extremely complicated optimizations.

If you think you need to develop a VM, I recommend reconsidering and thinking about how you can reuse something that already exists. For instance, the modern .NET VM is open source under the MIT license, the code quality is more or less OK, and it’s relatively easy to generate .NET bytecode from something else: Reflection.Emit from the inside, or Mono.Cecil from the outside.

2 comments

The point of writing your own VM isn’t to come up with something on par with Sun’s or Microsoft’s VM, but rather to get hands-on experience with the inner workings of a VM.

Real-life VMs don’t interpret, they JIT compile. The code in the article has nothing in common with the inner workings of real-life VMs.

Even VMs that do interpret don’t do it the way the article does, take a look: https://github.com/python/cpython/blob/v3.9.0rc1/Python/ceva...

What exactly is it you’re learning then?

> Real-life VMs don’t interpret, they JIT compile.

Pretty much every real-life JavaScript VM is tiered and has an interpreter, which gathers data about expected usage that the JIT then uses to inform its optimizations when it generates machine code.

Still, you'd be surprised by the performance you can get out of a basic interpreter. Games have used Lua for years. I've written and reverse engineered plenty of custom bytecode for various reasons in the games space. It's a useful tool to have, and there are a lot of situations where performance either isn't the goal, or the large number of tricks used by JITting VMs isn't helpful.

The dispatch loop you point to is just about using a C extension (computed gotos) to gain a few extra performance points. You can learn about it in about 30 minutes once you know what a VM is.
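To make the difference concrete, here is a minimal sketch of the same toy stack machine dispatched two ways: a plain switch loop, and computed gotos via the GCC/Clang "labels as values" extension. The opcodes (OP_PUSH, OP_ADD, OP_MUL, OP_HALT) and function names are made up for illustration and are not taken from the article or from CPython; there is no bounds or error checking.

```c
#include <stdint.h>

/* Hypothetical opcodes for a tiny stack machine (illustration only). */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Plain switch dispatch: every instruction goes back through the
   single switch at the top of the loop. */
static int64_t run_switch(const int64_t *code) {
    int64_t stack[64];
    int sp = 0;
    for (;;) {
        switch (*code++) {
        case OP_PUSH: stack[sp++] = *code++; break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp]; break;
        case OP_HALT: return stack[sp - 1];
        }
    }
}

/* Computed-goto dispatch: each handler jumps straight to the next
   handler through a table of label addresses.
   Requires GCC or Clang; this is not standard C. */
static int64_t run_goto(const int64_t *code) {
    static void *const labels[] = { &&do_push, &&do_add, &&do_mul, &&do_halt };
    int64_t stack[64];
    int sp = 0;
#define DISPATCH() goto *labels[*code++]
    DISPATCH();
do_push: stack[sp++] = *code++; DISPATCH();
do_add:  sp--; stack[sp - 1] += stack[sp]; DISPATCH();
do_mul:  sp--; stack[sp - 1] *= stack[sp]; DISPATCH();
do_halt: return stack[sp - 1];
#undef DISPATCH
}
```

Both functions take an array such as `{ OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT }` and return the value left on top of the stack. Whether the goto version is actually faster depends on the compiler and CPU, so measure before assuming.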

It's a learning experience, it doesn't have to be production quality.

It will teach you skills that are applicable in other contexts and give a deeper understanding of what makes a VM tick.

And sometimes, a tiny VM is just what you need.

I agree about the learning experience. But when I need a tiny language, I usually implement it so that it compiles into something already implemented and supported. Not necessarily .NET bytecode; here’s an example where I compiled a tiny language into HLSL source for Microsoft’s shader compiler: https://github.com/Const-me/vis_avs_dx/tree/master/avs_dx/Dx...

Yeah, sometimes that works. I wanted to write a "compiler" for brainfuck recently, so my first step was converting a BF program to C, then using gcc to compile that.

I guess my actual compiler wasn't so different: it just generated an assembly source file, then passed that through an assembler. But that kind of transpiling is pretty simple to get working, and if your destination language is fast enough, or optimized enough, then it'll work just fine.

https://github.com/skx/bfcc/
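For a sense of how little code the BF-to-C step needs, here is a rough sketch. It is not the code from the repository above; the function name `bf_to_c`, the 30000-cell tape, and the choice to store 0 on end of input are all assumptions made for this example.

```c
#include <stdio.h>

/* Write a C translation of the brainfuck program `src` to `out`.
   Returns 0 on success, -1 if the brackets are unbalanced.
   Assumptions: 30000 byte-sized cells, EOF on input stores 0. */
static int bf_to_c(const char *src, FILE *out) {
    int depth = 0;
    fputs("#include <stdio.h>\n"
          "int main(void) {\n"
          "static unsigned char tape[30000];\n"
          "unsigned char *p = tape;\n", out);
    for (; *src; src++) {
        switch (*src) {
        case '>': fputs("++p;\n", out); break;
        case '<': fputs("--p;\n", out); break;
        case '+': fputs("++*p;\n", out); break;
        case '-': fputs("--*p;\n", out); break;
        case '.': fputs("putchar(*p);\n", out); break;
        case ',': fputs("{ int c = getchar(); *p = (c == EOF) ? 0 : (unsigned char)c; }\n", out); break;
        case '[': fputs("while (*p) {\n", out); depth++; break;
        case ']':
            if (depth == 0) return -1;   /* closing bracket with no opener */
            fputs("}\n", out);
            depth--;
            break;
        default: break;                  /* any other character is a comment */
        }
    }
    fputs("return 0;\n}\n", out);
    return depth == 0 ? 0 : -1;          /* leftover '[' means unbalanced */
}
```

Calling `bf_to_c(program_text, stdout)` and piping the result into gcc gives a native binary. All the optimization work is left to the C compiler, which is the point of the approach.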

From my experience, writing your own gives you a level of control and leverage not achievable through a third-party VM, which might be worth it once you get some practice. As long as you haven't tried, you have no idea about the trade-offs; that goes for anything.

The same goes for not compiling all the way down to bytecode (or even worse, C): it makes some things easier to implement and allows more flexibility because of the lack of separation between compile time and run time.