by tptacek 3874 days ago
Hm. 12 sentences. Someone here can do it better in fewer sentences, I think.

Much of assembly programming is variants of these commands:

    - mov <dst>, <src>            | dst = src
    - sub <dst>, <src a>, <src b> | dst = a - b
    - jump <label>                | continue running at <label>
    - jump if equal <label>       | continue running at <label> if both sources of
                                    the last command were equal
    - call <label>                | save the current location, then continue
                                    running at <label>
    - ret                         | continue running after the previous CALL
                                    instruction
<dst> can be:

    - any memory address
    - any of 8-32 available temporary integer variables called "registers"
<src> can be any valid <dst> or a hardcoded integer known as an "immediate"
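
To make the list above concrete, here is a rough sketch of a toy interpreter for those six commands. It is not any real instruction set: the tuple encoding, the `label` pseudo-instruction, the register names, and the `run` helper are all made up for illustration. It also assumes registers only (no memory addresses) and that only `sub` records the "sources of the last command" that `jump if equal` looks at.

```python
def run(program, registers=None, max_steps=10_000):
    """Run a list of instruction tuples and return the final register values."""
    regs = dict(registers or {})
    # Map each label name to its position in the program.
    labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "label"}
    call_stack = []               # return locations saved by "call"
    last_sources = (None, None)   # operands of the most recent "sub"
    pc = 0                        # index of the next instruction to run
    steps = 0

    def value(src):
        # An int is an immediate; a string names a register.
        return src if isinstance(src, int) else regs[src]

    while pc < len(program):
        steps += 1
        if steps > max_steps:
            raise RuntimeError("step limit exceeded")
        op, *args = program[pc]
        pc += 1
        if op == "label":
            continue
        elif op == "mov":
            dst, src = args
            regs[dst] = value(src)
        elif op == "sub":
            dst, a, b = args
            last_sources = (value(a), value(b))
            regs[dst] = last_sources[0] - last_sources[1]
        elif op == "jump":
            pc = labels[args[0]]
        elif op == "jump_if_equal":
            if last_sources[0] == last_sources[1]:
                pc = labels[args[0]]
        elif op == "call":
            call_stack.append(pc)     # save the current location
            pc = labels[args[0]]
        elif op == "ret":
            if not call_stack:
                break                 # assumption: a top-level ret ends the program
            pc = call_stack.pop()
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs


# Hypothetical example: call "bump" once per loop iteration, three times.
PROGRAM = [
    ("mov", "r0", 3),             # r0 = loop counter
    ("mov", "r1", 0),             # r1 = number of calls made
    ("label", "loop"),
    ("sub", "r2", "r0", 0),       # compare r0 with 0
    ("jump_if_equal", "done"),
    ("call", "bump"),
    ("sub", "r0", "r0", 1),
    ("jump", "loop"),
    ("label", "done"),
    ("ret",),
    ("label", "bump"),
    ("sub", "r1", "r1", -1),      # r1 = r1 - (-1), i.e. r1 + 1
    ("ret",),
]
```

Calling `run(PROGRAM)` walks the loop until the `sub` against 0 sees two equal sources, then jumps to `done`. Real machines keep the return location on a stack in memory rather than in a separate list, but the control flow is the same idea.
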
This is both more detailed than mine and uses fewer sentences. My only nit is that I feel like understanding status flags is really important.
I had several nitpicks with mine, but realized once the learner starts asking questions about gaps they're in a pretty good place to find an answer on their own.

I do agree flags are less obvious, but it also takes a bit of space to list what kind of flags one might see:

"jump if equal" is facilitated by an implicit "flags" register which is set after most arithmetic or comparison operations like "sub". Possible flags include:

- carry: the last arithmetic operation carried or borrowed out of the top bit (unsigned overflow)

- zero: result of the last operation was zero

- parity: the low byte of the result has an even number of 1 bits (on x86; it does not mean the result was odd)

- sign: result of the last operation was negative
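
As a rough illustration of those four flags, here is a small sketch that computes them for a subtraction on fixed-width integers. It assumes x86 conventions (carry is the unsigned borrow, parity is set when the low byte has an even number of 1 bits); the `flags_after_sub` name and the 8-bit default width are just choices for the example.

```python
def flags_after_sub(a, b, bits=8):
    """Return the result and flags an x86-style `sub a, b` would produce."""
    mask = (1 << bits) - 1
    a &= mask                     # treat both operands as unsigned, `bits` wide
    b &= mask
    result = (a - b) & mask       # wrap around like a real register
    return {
        "result": result,
        "carry": a < b,                                   # a borrow was needed
        "zero": result == 0,
        "parity": bin(result & 0xFF).count("1") % 2 == 0, # even count of 1 bits in low byte
        "sign": bool(result >> (bits - 1)),               # top bit set, i.e. negative if signed
    }
```

For example, `flags_after_sub(5, 5)` sets zero (which is what "jump if equal" tests), while `flags_after_sub(3, 5)` wraps around to 254 and sets carry and sign instead.
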

In the context of reverse engineering I feel like learning assembly this way is 'doing it wrong'. Of course, there's no such thing as 'doing it wrong' when it comes to learning, so take this as one person's experience.

What worked for me was going in the opposite direction: start with simple C programs [0], and see what compilers do with them. If you understand C, you already kind of understand how the machine works, though without the machine specifics. If you see an assembly instruction that you don't recognize, check the manual [1]. You can view compiler output online these days, say, with [2]. Here's an example of a simple program that covers calls and branches: http://goo.gl/DKrYrE

Then, use an interactive debugger (like OllyDbg or whatever works for your platform) to trace through your small programs, instruction by instruction, and see how memory and registers get manipulated at each step. Change instructions and see what happens. This will also make you familiar with common compiler idioms, which is very useful in RE work. Once you get reasonably familiar with these small programs, try your hand at a program you _don't know_, or increase the complexity of your small programs. Rinse and repeat.

[0] The choice of C here is relevant, since many other compiled languages tend to add a lot of cruft to their binaries.

[1] http://www.felixcloutier.com/x86/

[2] http://gcc.godbolt.org/

I would generally agree that writing/manipulating assembly language is a better learning device than black-box reversing (I mentioned reversing because of the context of the article).

But learning to reverse assembly language is also one of those daunting tasks that turns out not to live up to its scary reputation. I wouldn't want to suggest that you can't just dive in and learn to reverse if reversing is your actual goal.

My bad; I somehow skipped over your paragraph saying to 'dive in', which renders all of my disagreement invalid.
No, your disagreement is awesome stuff. I'm glad I got you to write it down.