by kevingadd 4844 days ago
The ILSpy team's hard work is part of what made it possible for me to write my .NET -> JS compiler (http://jsil.org/). My ~120k LoC wouldn't work without their ~450k LoC (well, I don't consume all 450k...)

ILSpy is a pretty interesting application/library to look at under the hood. The decompilation logic that transforms .NET bytecode (MSIL) into higher-level data structures is split into a bunch of well-defined transform phases that run in a pipeline, which means you can actually step through the pipeline and watch it improve the readability and semantic clarity of the IL one step at a time. It's an incredibly valuable debugging tool and really useful for understanding how this kind of decompiler works, and it was a big influence on how I ended up designing the similar parts of my compiler.
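
To make the "step through the pipeline" idea concrete, here is a minimal sketch in Python. It is not ILSpy's actual code or API: the instruction format (plain strings), the two passes, and the names `remove_nops`, `fold_store_load`, and `run_stepwise` are all made up for illustration. The only point is that each phase is a small function and you can look at the code after every one.

```python
from typing import Callable, Iterator, List, Tuple

# Toy "IL": a flat list of instruction strings. The real decompiler works on
# much richer data structures; this is an illustration only.
Code = List[str]
Transform = Callable[[Code], Code]


def remove_nops(code: Code) -> Code:
    """Drop instructions that do nothing."""
    return [ins for ins in code if ins != "nop"]


def fold_store_load(code: Code) -> Code:
    """Remove a 'stloc X' that is immediately followed by 'ldloc X'.

    Assumption of this toy model: X is never read again afterwards.
    """
    out: Code = []
    i = 0
    while i < len(code):
        if (i + 1 < len(code)
                and code[i].startswith("stloc ")
                and code[i + 1] == "ldloc " + code[i].split()[1]):
            i += 2  # skip the redundant store/load pair
        else:
            out.append(code[i])
            i += 1
    return out


# Hypothetical pipeline: an ordered list of phases.
PIPELINE: List[Transform] = [remove_nops, fold_store_load]


def run_stepwise(code: Code, pipeline: List[Transform]) -> Iterator[Tuple[str, Code]]:
    """Yield (phase name, code after that phase) so every step can be inspected."""
    for phase in pipeline:
        code = phase(code)
        yield phase.__name__, code


if __name__ == "__main__":
    il = ["nop", "ldarg 0", "stloc tmp", "ldloc tmp", "nop", "ret"]
    for name, snapshot in run_stepwise(il, PIPELINE):
        print(name, snapshot)
```

Because `run_stepwise` is a generator, a debugger or UI can stop after any phase and show the intermediate result, which is the property described above.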

As a whole, I think ILSpy demonstrates just how valuable it is to have a really well-specified instruction set sitting beneath your compiler and runtime. MSDN's documentation for the instruction set is clear and understandable, and libraries like Cecil and ILSpy make it easy to load, manipulate, and save IL for whatever purpose you have in mind: runtime code generation, machine transforms of compiled code, obfuscation, deobfuscation, or outright cross-compilation.


Interesting, so the benefit there is generating higher-level JavaScript instead of interpreting IL at a lower level?
Yeah. I consume the munged IL that comes out of their transform pipeline (though for complex reasons, I don't use all of it - some of their transforms are destructive in ways that aren't helpful, or I'd have to undo them) which saves me the trouble of reimplementing things they already figured out, like how to transform most branch/jump patterns into if statements and while loops.
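
For a rough idea of what "transform branch/jump patterns into while loops" means, here is a toy sketch in Python. It is not how ILSpy does it (real control-flow recovery works on a control-flow graph and handles far more cases). The tuple format, the explicit `label` pseudo-instruction, the condition stored inside the `brfalse` tuple, and the function name `recover_while` are all assumptions made for the example.

```python
from typing import List


def recover_while(code: List[tuple]) -> List[tuple]:
    """Rewrite the pattern

        label A; brfalse cond -> B; body...; br A; label B

    into a single ("while", cond, body) node. Everything else passes through.
    """
    out: List[tuple] = []
    i = 0
    while i < len(code):
        ins = code[i]
        if ins[0] == "label" and i + 1 < len(code) and code[i + 1][0] == "brfalse":
            head = ins[1]
            _, cond, exit_label = code[i + 1]
            # Look for the back edge "br head" directly followed by the exit label.
            for j in range(i + 2, len(code) - 1):
                if code[j] == ("br", head) and code[j + 1] == ("label", exit_label):
                    body = code[i + 2:j]
                    # Recurse so nested loops in the body are recovered too.
                    out.append(("while", cond, recover_while(body)))
                    i = j + 2
                    break
            else:
                # No matching back edge: leave the label untouched.
                out.append(ins)
                i += 1
            continue
        out.append(ins)
        i += 1
    return out


if __name__ == "__main__":
    flat = [
        ("label", "L0"),
        ("brfalse", "i < n", "L1"),
        ("stmt", "total += i"),
        ("stmt", "i += 1"),
        ("br", "L0"),
        ("label", "L1"),
        ("stmt", "return total"),
    ]
    print(recover_while(flat))
```

This sketch ignores jumps into or out of the loop body from elsewhere, which is exactly the kind of detail that makes the real thing worth reusing rather than reimplementing.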

I could generate JS from raw IL (and other projects like Volta did just that) but ILSpy gives me a huge head start in terms of producing JS that actually looks like what you'd write by hand. For loops instead of while loops, switch statements instead of cascading ifs, etc.
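
As an illustration of the "for loops instead of while loops" step, here is a small Python sketch over a made-up statement format. It is not JSIL's or ILSpy's implementation. The node shapes (`("assign", var, expr)`, `("while", cond, body)`, `("expr", text)`) and the name `while_to_for` are hypothetical.

```python
from typing import List


def while_to_for(stmts: List[tuple]) -> List[tuple]:
    """Turn  init; while (cond) { body...; step }  into
    ("for", init, cond, step, body) when init and the trailing step
    both assign to the same variable.

    Caveat: this toy version does not check for 'continue' in the body,
    which would change behavior (a for loop still runs the step on continue).
    """
    out: List[tuple] = []
    i = 0
    while i < len(stmts):
        cur = stmts[i]
        nxt = stmts[i + 1] if i + 1 < len(stmts) else None
        if (cur[0] == "assign"
                and nxt is not None
                and nxt[0] == "while"
                and nxt[2]                          # non-empty body
                and nxt[2][-1][0] == "assign"       # body ends with an assignment...
                and nxt[2][-1][1] == cur[1]):       # ...to the same variable as init
            _, cond, body = nxt
            out.append(("for", cur, cond, body[-1], body[:-1]))
            i += 2  # consumed both the init and the while
        else:
            out.append(cur)
            i += 1
    return out


if __name__ == "__main__":
    before = [
        ("assign", "i", "0"),
        ("while", "i < n", [("expr", "total += i"), ("assign", "i", "i + 1")]),
        ("expr", "return total"),
    ]
    print(while_to_for(before))
```

The output of a pass like this maps directly onto `for (i = 0; i < n; i = i + 1) { ... }` in the generated JS, which reads much closer to hand-written code.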

Volta, from the demo I used a long time ago, seemed horrendously slow, too. JSIL feels far faster. I guess you gain a bit of performance by making the code higher level so the JS engines can tell if an optimization is safe.

Is JSIL limited to a subset of IL? Can you compile C++ (in pure mode) and feed it in? What about opcodes like cpblk, and others?

Volta's approach seems to be essentially implementing a low-level .NET runtime and representing the bytecode as JS. It gives you some cool stuff for free (for example, that low-level approach means their type system works almost exactly like .NET's across the board) but it does indeed mean that you have to work harder to give JS engines an opportunity to optimize your code. It makes integration tougher, too. In comparison, the best way to describe JSIL's approach is that it tries to express .NET concepts on top of JS, so it uses JS standard libraries and types wherever possible.
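
A very rough way to picture the difference, sketched in Python: it is neither Volta's nor JSIL's actual design, just two ways of handling the same toy stack program. The instruction tuples and the names `interpret` and `translate` are invented for the example. The first function simulates the evaluation stack every time the code runs. The second simulates it once, ahead of time, and emits an ordinary expression that a JS engine could optimize like any other.

```python
from typing import List, Sequence, Tuple

Instr = Tuple[str, ...]

# Toy program computing (arg0 + arg1) * arg0.
PROGRAM: List[Instr] = [
    ("ldarg", "0"), ("ldarg", "1"), ("add",),
    ("ldarg", "0"), ("mul",), ("ret",),
]


def interpret(program: Sequence[Instr], args: Sequence[int]) -> int:
    """Low-level approach: simulate the evaluation stack at run time."""
    stack: List[int] = []
    for ins in program:
        op = ins[0]
        if op == "ldarg":
            stack.append(args[int(ins[1])])
        elif op in ("add", "mul"):
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if op == "add" else a * b)
        elif op == "ret":
            return stack.pop()
    raise ValueError("program ended without ret")


def translate(program: Sequence[Instr], arg_names: Sequence[str]) -> str:
    """Higher-level approach: run the stack simulation once, at compile time,
    and emit a single JavaScript-like statement."""
    stack: List[str] = []
    for ins in program:
        op = ins[0]
        if op == "ldarg":
            stack.append(arg_names[int(ins[1])])
        elif op in ("add", "mul"):
            b, a = stack.pop(), stack.pop()
            symbol = "+" if op == "add" else "*"
            stack.append("(" + a + " " + symbol + " " + b + ")")
        elif op == "ret":
            return "return " + stack.pop() + ";"
    raise ValueError("program ended without ret")


if __name__ == "__main__":
    print(interpret(PROGRAM, [2, 3]))
    print(translate(PROGRAM, ["a", "b"]))
```

The translated form has no stack and no dispatch loop left in it, which is the sort of thing that gives a JS engine more room to optimize.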

JSIL is theoretically limited in that there are things expressible in IL that you simply can't do in a browser. However, out of all the executables I've run the compiler on so far, very little of the IL they contain is actually impossible to translate - the tricky patterns and opcodes seem to get used only occasionally in one or two methods.

Some parts are definitely harder than others; I've only recently gotten support for pointers and the 'unsafe' class of C# features working (http://jsil.org/try/#5055026), and that only covers a subset of all the different opcodes defined for doing interesting things with pointers and references. For example, function pointers will probably never work, and IIRC there are a few opcodes dedicated to interacting with those.