by viraptor 3566 days ago
Just curious; maybe someone has a better idea of whether this is harder or easier than writing a whole CPU emulator. What about writing a compiler from 16-bit binaries to 32-bit/64-bit binaries? Specifically, translate the common calls so they end up as close to natively compiled code as possible. It seems to me that unless someone wrote some functions directly in asm, most code should be relatively easy to translate.

Pros: better performance, better memory usage, most instructions map 1:1, there are lots of extra registers for storing the extra state required by the translation, easier debugging (you can compile each function separately and verify it without running it).

Cons: harder debugging (at runtime), harder to test, memory segmentation has to be translated properly, you need a way to adjust all the offsets automatically, and you need to convert all the API function calling conventions (which can be quite tricky with variadic arguments).
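To make the segmentation point concrete, here is a minimal sketch of the address arithmetic a translator or emulator has to reproduce for every memory access. It assumes two cases: 8086 real mode (segment times 16 plus offset) and a heavily simplified protected-mode lookup, where the selector indexes a descriptor table. The names `Descriptor`, `real_mode_linear` and `protected_mode_linear`, and the dict-based table, are made up for illustration; real descriptors carry more fields (access rights, granularity) than shown here.

```python
# Hypothetical sketch: names and table layout are illustrative only,
# not taken from any real emulator or translator.
from dataclasses import dataclass
from typing import Dict


@dataclass
class Descriptor:
    base: int   # linear base address of the segment
    limit: int  # highest valid offset within the segment


def real_mode_linear(segment: int, offset: int) -> int:
    """8086 real mode: linear = segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


def protected_mode_linear(table: Dict[int, Descriptor],
                          selector: int, offset: int) -> int:
    """Simplified protected-mode lookup.

    Bits 3..15 of the selector index the descriptor table; the low three
    bits (table indicator and privilege level) are ignored in this sketch.
    """
    desc = table[selector >> 3]
    if offset > desc.limit:
        raise MemoryError("offset %#x is past the segment limit" % offset)
    return desc.base + offset
```

A static translator would need to do this (or prove it can skip it) wherever the original code loads a segment register, which is part of why the "adjust all the offsets automatically" item is hard.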


There's all sorts of weird stuff like 'thunks' that need to be taken into account. It would be faster but is considerably harder to write, and "faster" isn't much of a requirement for Win3 apps.

Whereas a straightforward emulator of the 16-bit instruction set with the standard computed jump table is fairly easy. Mostly requires a lot of typing. If your interpreter fits in the i-cache it can even be faster.
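For anyone who hasn't seen the jump-table approach, here is a toy sketch of the dispatch loop. Python has no computed goto, so a 256-entry list of handler functions stands in for the jump table. The opcode bytes are the real 8086 encodings for NOP, INC AX, DEC AX, MOV AX,imm16 and HLT, but everything else (`Cpu`, `DISPATCH`, `run`, the lack of flags and segment registers) is a simplification assumed for the example.

```python
# Hypothetical sketch: Cpu, DISPATCH and run are illustrative names.
# Flags, segments and almost every opcode are deliberately left out.

class Cpu:
    def __init__(self, code):
        self.mem = bytearray(code)
        self.ip = 0
        self.ax = 0
        self.halted = False


def op_unimplemented(cpu):
    # ip has already been advanced past the opcode byte.
    raise NotImplementedError("opcode %#04x" % cpu.mem[cpu.ip - 1])


def op_nop(cpu):
    pass


def op_inc_ax(cpu):
    cpu.ax = (cpu.ax + 1) & 0xFFFF


def op_dec_ax(cpu):
    cpu.ax = (cpu.ax - 1) & 0xFFFF


def op_mov_ax_imm16(cpu):
    # The 16-bit immediate follows the opcode, low byte first.
    cpu.ax = cpu.mem[cpu.ip] | (cpu.mem[cpu.ip + 1] << 8)
    cpu.ip += 2


def op_hlt(cpu):
    cpu.halted = True


# One slot per possible opcode byte; unknown opcodes fail loudly.
DISPATCH = [op_unimplemented] * 256
DISPATCH[0x90] = op_nop
DISPATCH[0x40] = op_inc_ax
DISPATCH[0x48] = op_dec_ax
DISPATCH[0xB8] = op_mov_ax_imm16
DISPATCH[0xF4] = op_hlt


def run(cpu):
    while not cpu.halted:
        opcode = cpu.mem[cpu.ip]
        cpu.ip += 1
        DISPATCH[opcode](cpu)
    return cpu
```

The "lot of typing" is filling in the other couple of hundred handlers plus the ModR/M decoding; the loop itself stays this small.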

Although, since WINE will apparently run 16-bit code, it looks like preservation of the 16-bit world is a Solved Problem.

(I've long wanted a human-assisted decompiler for 16-bit DOS/Windows code, as a means of salvaging old games. For a few games people have done this by hand and built modernised versions with the bugs fixed.)

Ok, my (unmentioned) assumption was VC-created code with no added assembly. I realise that VB code for example would be stupid hard to convert this way.

I'm curious what you mean by thunks in this case, specifically why they would be different from other code. Is it that common code in the original program may not be valid common code after translation?

There was some use of self-modifying code, although I can't find exact references. It'll be on Raymond Chen's blog somewhere.

I don't think it should matter whether the original source was VC or not, and if you write a solution that does make it matter you'll find yourself tripped up a lot. (Your favourite game turns out to have been compiled with Watcom or Borland rather than VC, etc). Presumably if you could find the 16-bit VB interpreter bits and pieces you could run those too and run your VB app.

I did run across this blog post http://discuss.joelonsoftware.com/default.asp?joel.3.607174...., which could be turned into a Win10 complaint with a simple search-and-replace.

Some compatibility craziness that isn't quite relevant to our Win16 emulation case: https://blogs.msdn.microsoft.com/oldnewthing/20071224-00/?p=...

Thunks were used to run 32-bit applications on 16-bit Windows via Win32s, and were also the way you would prepare callbacks to be exposed to Windows APIs.

I was about to add a reference to the Win32s thunking, and then chanced upon this MSDN article [0] that talks about supporting 32-bit I/O in 64-bit drivers.

[0] https://msdn.microsoft.com/en-us/library/windows/hardware/ff...

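As a rough illustration of the kind of glue involved at an API boundary, here is a sketch of a call adapter for the Pascal calling convention that Win16 API functions use: arguments are pushed left to right, the far call pushes CS and then IP, and the callee removes the arguments on return. This is not how Win32s thunks were actually implemented. `Cpu16`, `make_pascal_thunk` and the flat byte-array stack are assumptions made for the example, and only 16-bit word arguments and a 16-bit return value in AX are handled.

```python
# Hypothetical sketch: Cpu16 and make_pascal_thunk are illustrative names,
# not code from WINE or Win32s. Only 16-bit word arguments are handled.

class Cpu16:
    def __init__(self, stack_size=64):
        self.stack = bytearray(stack_size)
        self.sp = stack_size  # the stack grows downwards
        self.ax = 0
        self.cs = 0
        self.ip = 0

    def push16(self, value):
        self.sp -= 2
        self.stack[self.sp] = value & 0xFF
        self.stack[self.sp + 1] = (value >> 8) & 0xFF

    def read16(self, addr):
        return self.stack[addr] | (self.stack[addr + 1] << 8)


def make_pascal_thunk(native_fn, arg_count):
    """Wrap a native function so emulated far-Pascal code can call it."""
    def thunk(cpu):
        # Stack on entry: [sp] = return IP, [sp+2] = return CS, then the
        # arguments, with the last-pushed (rightmost) one at the lowest address.
        base = cpu.sp + 4
        args = [cpu.read16(base + 2 * (arg_count - 1 - i))
                for i in range(arg_count)]
        cpu.ax = native_fn(*args) & 0xFFFF  # 16-bit result goes back in AX
        # Emulate RETF n: pop IP and CS, then discard the arguments.
        cpu.ip = cpu.read16(cpu.sp)
        cpu.cs = cpu.read16(cpu.sp + 2)
        cpu.sp += 4 + 2 * arg_count
    return thunk
```

Variadic functions are the awkward case because they use the C convention instead, where the caller cleans up and the adapter cannot know the argument count from the signature alone.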
> I've long wanted a human-assisted decompiler for 16-bit DOS/Windows code, as a means of salvaging old games.

If you're lucky (like me with EarthSiege 2), your game might also have a 32-bit version, which you can run through IDA/Hex-Rays just fine.