by georgeplusplus 2369 days ago
To me, assembly is one of the things that, while self-studying CS, I felt lacked good support in resources such as this, MOOCs, or just plain explanations. Usually when I post a topic trying to demystify it, I am greeted with an extremely hard-to-digest read that is meant more for people already knowledgeable in the subject.

And I feel assembly should be more of a core skill in a programmer's toolbox. So this article is very welcome to me.

4 comments

Modern assembly is kind of...bad.

I would recommend checking out an old book for an old mainframe's assembly language. They're usually much less mystic by virtue of being much less complex. IBM had some really nice manuals and books; no one ever got fired for buying IBM because an IBM machine could be programmed by a dog.

Octal is where it's really at, though, if you get really into this. A fun weekend project is to write an octal "decompiler" (ideally you won't have compiled anything, just written some octal by hand) that lets you reason about what the code is doing by translating it to an actual language rather than just thin syntactic sugar over 1s and 0s. Octal itself isn't so difficult, and it makes binary much easier to reason about, but this definitely helps you get a more intuitive sense of what is what.

Of course, it's not something that has a substantial amount of value with modern machines. Maybe eventually we'll get back there; I think I'll enjoy it when we do. Until then, though, it's fun to play with.
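For anyone who wants to try the weekend project, here is a minimal sketch of such an octal "decompiler". The comment above doesn't name a machine, so this assumes the PDP-8's 12-bit memory-reference layout (3-bit opcode, indirect bit, page bit, 7-bit offset); the `decode` function and its output format are made up for illustration.

```python
# Sketch of an octal "decompiler" for PDP-8-style 12-bit words.
# Assumption: PDP-8 memory-reference layout; the thread names no machine.

MNEMONICS = ["AND", "TAD", "ISZ", "DCA", "JMS", "JMP"]  # opcodes 0-5


def decode(word_octal: str) -> str:
    """Translate one octal word (up to 4 digits) into a readable line."""
    word = int(word_octal, 8)  # raises ValueError on non-octal digits
    if not 0 <= word <= 0o7777:
        raise ValueError(f"not a 12-bit word: {word_octal}")
    opcode = word >> 9  # the first octal digit is the opcode
    if opcode == 6:
        return f"IOT {word & 0o777:03o}"  # I/O transfer, left undecoded
    if opcode == 7:
        return f"OPR {word & 0o777:03o}"  # operate group, left undecoded
    indirect = (word >> 8) & 1       # 1 = address is a pointer
    current_page = (word >> 7) & 1   # 0 = page zero, 1 = current page
    offset = word & 0o177            # low 7 bits
    flag = " I" if indirect else ""
    page = "cur" if current_page else "zero"
    return f"{MNEMONICS[opcode]}{flag} {page}+{offset:03o}"


if __name__ == "__main__":
    for w in ["1234", "5600", "3010", "7402"]:
        print(w, "->", decode(w))
```

Because each octal digit is exactly three bits, the opcode falls out of the first digit, which is a large part of why hand-written octal is readable on a machine like this.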

> Octal is where it's really at

Why octal, not hex?

Octal is easier to keep in your head while not sacrificing any efficiency.

If you really want to dig into the concepts behind assembler, the assembler language part of TAOCP is available for free download: http://mmix.cs.hm.edu/doc/fasc1.pdf. It doesn’t talk about a “real” assembler language like 6502 or 8086, but a made-up one that was designed to present the concepts. It’s easy to move from Knuth’s academic introduction to an actual assembler.

I actually want to learn it to work on reverse engineering projects.

I just don’t know how to get started at all.

I don’t know how people can reverse engineer a device when they don’t have access to the program running on it. How do you monitor and track all the bits being passed around to work back to the firmware? Specifically, I wanted to dabble in video game mods and hacks, since I find their programming fascinating and know that’s where I’d be most interested in contributing in my spare time.

> video game mods and hacks I wanted to dabble in

Not sure you need assembly for that.

If you want to modify 3D rendered output, you normally need to adjust shaders, textures and such. For extreme cases, you can hook the entire Direct3D API and adjust how it works for the game. The only assembly you might need for that is shader assembly https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/... but even that is not always necessary, as the HLSL decompilers are often OK.

If you want to modify game logic, it’s normally implemented as scripts. Game designers and level designers don’t often know C++, and they certainly don’t want to recompile the game because it’s slow. Instead, they adjust scripts and see the result in real time.

I know and understand C++ as it’s the main language I’ve been working in for some time.

How do I modify the code of that which I don’t have access to?

What reverse engineering projects are good for beginners? I see people post here about their first attempt to reverse an older gadget. I’d love to pick up an older gadget, reverse engineer it, and make it do what I want it to.

> How do I modify the code of that which I don’t have access to?

Native code reverse engineering is very time consuming. It’s often possible to achieve similar results by focusing on the code which you have access to. You don’t have source code of Windows OS components, but you do have their APIs and debug symbols, and that’s much better than just binaries.

If you want to change what’s rendered, you can replace the GPU API with a wrapped version, like RenderDoc does. If you want to change what’s loaded from disk, patch game files, or replace whatever OS file I/O APIs are used by the game (DLL injection, then MinHook or Detours).
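The wrap-and-forward idea can be sketched without any native code. This is only an analogy in Python, not how MinHook or Detours work internally: `FakeGpuApi`, `draw`, and `install_hook` are hypothetical stand-ins, and a real hook patches the native entry point inside the game process.

```python
# Analogy only: real hooks patch native code (e.g. via MinHook or Detours).
# FakeGpuApi and draw are hypothetical stand-ins, not a real API.


class FakeGpuApi:
    def draw(self, mesh: str, color: str) -> str:
        return f"draw {mesh} in {color}"


def install_hook(api: FakeGpuApi) -> None:
    original = api.draw  # keep the original so the wrapper can forward to it

    def hooked_draw(mesh: str, color: str) -> str:
        # Change the arguments for one mesh, then call the untouched original.
        return original(mesh, "pink" if mesh == "player" else color)

    api.draw = hooked_draw  # callers now reach the wrapper first


if __name__ == "__main__":
    api = FakeGpuApi()
    install_hook(api)
    print(api.draw("player", "red"))
    print(api.draw("tree", "green"))
```

The caller's code doesn't change at all, which is the appeal: the game keeps calling the same function and only the implementation behind it is swapped.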

Even when you do need to change the game’s own native code, directly patching machine code is rarely a good idea: it’s very hard to implement and especially to debug. An easier way is to replace complete functions with API-compatible replacements implemented in C++ in your own DLL. Again, use MinHook or Detours to replace the implementation. C++ allows unrestricted memory access, so you can read and write everywhere. Here are working examples: https://github.com/Const-me/vis_avs_dx/blob/master/avs_dx/Dx... https://github.com/Const-me/vis_avs_dx/blob/master/avs_dx/Dx... I didn’t have the source code of these C++ classes, but wanted their data regardless. I found the offsets using the VS debugger: these third-party DLLs include a GUI to change the values, so I compared memory before and after making changes.
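The before/after comparison can be sketched roughly like this. It assumes you already have two equal-sized byte snapshots of the same object; the 16-byte layout, the 32-bit integer setting, and both helper names are invented for the example, and the linked code does the actual reads in C++.

```python
import struct


def changed_offsets(before: bytes, after: bytes) -> list:
    """Return the byte offsets where two equal-sized snapshots differ."""
    if len(before) != len(after):
        raise ValueError("snapshots must have the same size")
    return [i for i, (a, b) in enumerate(zip(before, after)) if a != b]


def read_i32(snapshot: bytes, offset: int) -> int:
    """Read a little-endian signed 32-bit integer at a known offset."""
    return struct.unpack_from("<i", snapshot, offset)[0]


# Hypothetical object: 8 unknown bytes, a 32-bit setting, 4 more bytes.
before = bytes(8) + struct.pack("<i", 100) + bytes(4)
after = bytes(8) + struct.pack("<i", 2000) + bytes(4)

if __name__ == "__main__":
    offsets = changed_offsets(before, after)
    print("changed bytes at:", offsets)
    print("value at first changed offset:", read_i32(after, offsets[0]))
```

The first changed byte is only a candidate for the start of the field, since bytes that happen to match in both snapshots won't show up in the diff. Repeating the comparison with a few different values narrows it down.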

> What reverse engineer projects are good for beginners?

In the context of modern Windows games, assuming you wanna change what’s rendered, a good start might be https://renderdoc.org/. Officially, the tool is only supported when you run your own code. Technically, it often works with retail games too, just don’t open issues about that, they’ll be closed as an unsupported use case. As a nice side effect, you’ll learn a thing or two about Direct3D. The tool is open source with a good license (MIT), so you can fork it, disable its frame captures, and change its API wrappers to modify the output of some particular game.

One more thing, modern games use a lot of bytecode. E.g. D3D shaders are bytecode; search “3dmigoto decompiler” to decompile dxbc into HLSL. .NET is often bytecode (Unity3D is based on .NET); use reflector to decompile into C#. Many games use custom VMs, and sometimes the modding community has decompilers for their custom bytecode.

> I’d love to pick up an older gadget and try to reverse engineer it

What do you mean by “gadget”?

Depends on the platform. Older platforms like the NES or SEGA Genesis often had software written in assembly (6502 on the NES, 68k on the Genesis) - there are huge communities around modifying these games.

Good point. Yeah, for old games like NES, Genesis, or DOS, you often don’t have any other choice.

I only say this because I myself am a girl who learned 68k ASM when I was ten or eleven through the Sonic ROM hacking scene :)

I got a fairly good introduction to assembly with LC-3 (Little Computer 3, an instruction set for learning) programming in an elementary electrical engineering class in college. I haven't looked myself, but for those self-studying, searching "LC-3" might be a good option for learning assembly.