by isido
3874 days ago
This site and the accompanying book (http://beginners.re/Reverse_Engineering_for_Beginners-en.pdf) seem like a nice effort. From a brief read of the book, it seems that you need to be an intermediate or advanced beginner, or to know something about assembly beforehand. Some terms (opcode, ISA) are not really explained (except in the glossary) before they are used, and there is perhaps too much detail at the expense of the bigger picture. Criticisms aside, the book and the challenges seem interesting, and the author's efforts should be appreciated!
The right way to learn this stuff is to dive in. You'll be in over your head for a few hours, but you'll get your bearings. There are topics this technique doesn't work great with, but assembly reversing isn't one of them.
An additional benefit: assembly is one of those things that you might not use all the time in your career (although I've ended up using it quite a bit), but that will nonetheless illuminate lots of other things about computer science. There's a reason Knuth used it as a language to express algorithms in TAOCP.
I can sum up the core idea of assembly for you in just a few sentences:
* You're given 8-32 global variables of fixed size to work with, called "registers".
* Virtually all computation is expressed in terms of simple operations on registers.
* Real programs need many more than 8-32 variables to work with.
* What doesn't fit in registers lives in memory.
* Memory is accessed either with loads and stores at addresses, as if it were a big array, or through PUSH and POP operations on a stack.
* Memory is to an assembly program what the disk is to a Ruby program: you pull things out of memory into variables, do things with them, and eventually put them back into memory.
* Control flow is done via GOTOs: jumps, branches, or calls.
* A jump is just an unconditional GOTO.
* Most operations on registers, like addition and subtraction, have the side effect of altering status flags, like "the last value computed resulted in zero". There are just a few status flags, and they usually live in a special register.
* Branches are just GOTOs that are predicated on a status flag, like, "GOTO this address only if the last arithmetic operation resulted in zero".
* A CALL is just an unconditional GOTO that pushes the next address on the stack, so a RET instruction can later pop it off and keep going where the CALL left off.
Everything else is just a detail.
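To make those bullets concrete, here's a rough sketch of a toy machine in Python. Everything in it is made up for illustration: the register names (`r0` to `r3`), the opcode names (`movi`, `load`, `store`, `jnz`, and so on), the single zero flag, and the tuple encoding don't correspond to any real instruction set, and for simplicity the stack is kept separate from data memory (on a real CPU it lives in memory too). The sample program sums four words from memory inside a subroutine and stores the result back.

```python
def run(program, memory, max_steps=10_000):
    """Interpret (op, *args) tuples on a made-up register machine (not a real ISA)."""
    regs = {"r0": 0, "r1": 0, "r2": 0, "r3": 0}  # the fixed set of "global variables"
    stack = []    # used by push/pop/call/ret
    zero = False  # the only status flag: "the last add/sub produced zero"
    pc = 0        # index of the next instruction to execute

    for _ in range(max_steps):
        op, *args = program[pc]
        pc += 1
        if op == "halt":
            return regs
        elif op == "movi":    # movi rd, constant
            regs[args[0]] = args[1]
        elif op == "load":    # load rd, ra   -> rd = memory[ra]
            regs[args[0]] = memory[regs[args[1]]]
        elif op == "store":   # store rs, ra  -> memory[ra] = rs
            memory[regs[args[1]]] = regs[args[0]]
        elif op in ("add", "sub"):  # add rd, rs -> rd = rd +/- rs, updates the flag
            a, b = regs[args[0]], regs[args[1]]
            regs[args[0]] = a + b if op == "add" else a - b
            zero = regs[args[0]] == 0
        elif op == "push":
            stack.append(regs[args[0]])
        elif op == "pop":
            regs[args[0]] = stack.pop()
        elif op == "jmp":     # unconditional GOTO
            pc = args[0]
        elif op == "jz":      # GOTO only if the zero flag is set
            if zero:
                pc = args[0]
        elif op == "jnz":     # GOTO only if the zero flag is clear
            if not zero:
                pc = args[0]
        elif op == "call":    # push the return address, then GOTO
            stack.append(pc)  # pc already points at the next instruction
            pc = args[0]
        elif op == "ret":     # pop the return address and continue there
            pc = stack.pop()
        else:
            raise ValueError(f"unknown opcode: {op}")
    raise RuntimeError("step limit exceeded")


PROGRAM = [
    ("movi", "r1", 0),      # 0: r1 = address of the first element
    ("movi", "r2", 4),      # 1: r2 = number of elements
    ("call", 6),            # 2: call the subroutine at index 6
    ("movi", "r1", 4),      # 3: r1 = address to store the result at
    ("store", "r0", "r1"),  # 4: memory[4] = r0
    ("halt",),              # 5
    ("movi", "r0", 0),      # 6: subroutine: total = 0
    ("load", "r3", "r1"),   # 7: loop: r3 = memory[r1]
    ("add", "r0", "r3"),    # 8: total += r3
    ("movi", "r3", 1),      # 9: r3 = 1
    ("add", "r1", "r3"),    # 10: address += 1
    ("sub", "r2", "r3"),    # 11: count -= 1 (sets the zero flag when it hits 0)
    ("jnz", 7),             # 12: loop again while count != 0
    ("ret",),               # 13: return to index 3
]

if __name__ == "__main__":
    memory = [5, 7, 11, 19, 0]
    regs = run(PROGRAM, memory)
    print(regs["r0"], memory)  # prints: 42 [5, 7, 11, 19, 42]
```

Real instruction sets add addressing modes, more flags, and calling conventions on top of this, but the loop at indices 7 through 12 is the same shape you'll see in a disassembler.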
When working with assembly, a lot of people will just get the programmer's reference manual for the instruction set in a PDF (they're published for free). Here's a short one for x86:
http://ref.x86asm.net/coder32.html
Here's ARM:
http://infocenter.arm.com/help/topic/com.arm.doc.qrc0001m/QR...
Here's AVR:
http://www.atmel.com/images/atmel-0856-avr-instruction-set-m...
Even if you'd never written a line of assembly, if I asked you to express the procedures of a simple Ruby program in assembly and gave you the instruction set reference and those sentences, you would figure out how to get the job done in a couple hours. No monads or linear algebra required!
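As a rough idea of what that exercise looks like, here's a small procedure written twice in Python (standing in for Ruby here, purely as an assumption for this sketch): once the normal way, and once "lowered" by hand so that it only uses two register-like variables, a zero flag, and GOTO-style jumps between labels. The label names and the assembly mnemonics in the comments are illustrative, not taken from any particular instruction set reference.

```python
def sum_to_high_level(n):
    """Sum 1..n the way you'd normally write it."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_to_lowered(n):
    """Same computation using only 'registers', a zero flag, and jumps."""
    if n < 0:
        raise ValueError("this sketch assumes n >= 0")
    r0 = 0          # MOV r0, 0   ; r0 holds the running total
    r1 = n          # MOV r1, n   ; r1 counts down from n
    zero = r1 == 0  # CMP r1, 0   ; sets the zero flag
    pc = "check"    # which label we "GOTO" next
    while True:
        if pc == "check":
            pc = "done" if zero else "body"  # JZ done
        elif pc == "body":
            r0 = r0 + r1    # ADD r0, r1
            r1 = r1 - 1     # SUB r1, 1  ; updates the zero flag
            zero = r1 == 0
            pc = "check"    # JMP check
        elif pc == "done":
            return r0       # RET        ; result is left in r0


if __name__ == "__main__":
    print(sum_to_high_level(10), sum_to_lowered(10))  # prints: 55 55
```

Once the lowered version works, translating each commented line into the real mnemonics from the reference is mostly a lookup job.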