Hacker News
by cperciva 6081 days ago
If you want to learn x86-64 assembly language, I recommend a historical approach: Start by learning 8086 assembly language, then learn 80386 (i.e., 32-bit x86) assembly language, then learn MMX/SSE assembly language; and only once you've mastered all of those, start on x86-64.

The x86 instruction set is kludges built upon kludges, and you're never going to understand it fully if you try to jump in at the end without seeing how it developed.

3 comments

I learned asm starting with MIPS and then had the luxury of working at a place that was designing MIPS hardware. It was nice to be able to get into the assembler when some kernel modification wasn't working very well.

I have only good things to say about MIPS as a great place to learn ASM. The thing is, the better you understand how processors and pipelines work, the better you'll understand why instruction sets are the way they are.

If you want to learn a very bad assembler (for programmers) but one that's very easy to understand (for microcontroller designers), there's always 68HC11. How do two 8-bit registers and two 16-bit registers make you feel? (Probably like going to/from memory quite a bit :)

Agreed, the biggest switch is probably from 286 to 386.

486 and Pentium didn't really do much instruction-set-wise (sure, there were changes); the next big change is the 64-bit systems.

edit: What do you mean, kludges? It's the most orthogonal instruction set known to man!

Forget everything about the code, data, and stack segments.
The confusion and the reason you got modded down is that people don't understand that the segmentation is still present in the CPU, but that common practice now is to overlay all segments as a single 4GB space.

In other words, if you really really insisted you could mess around with giving the segment registers different base values and / or lengths but you'd probably end up regretting it.
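
To make the "overlay" point concrete, here is a minimal sketch in C of how a segment turns an offset into a linear address. It is a deliberately simplified model, not the real descriptor format: `segment_t`, `FLAT_SEGMENT` and `linear_address` are names made up for illustration, and the limit is treated as a plain byte count (the highest valid offset), ignoring granularity bits, access rights and privilege checks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified segment: just a base and the highest valid offset. */
typedef struct {
    uint32_t base;
    uint32_t limit;
} segment_t;

/* The common flat setup: every segment starts at 0 and spans the full 4GB. */
static const segment_t FLAT_SEGMENT = { 0x00000000u, 0xFFFFFFFFu };

/* Translate an offset within a segment to a linear address.
   Returns false when the offset is past the segment limit
   (where the real CPU would raise a fault). */
static bool linear_address(segment_t seg, uint32_t offset, uint32_t *out)
{
    if (offset > seg.limit) {
        return false;
    }
    *out = seg.base + offset; /* unsigned add, wraps modulo 2^32 */
    return true;
}
```

With `FLAT_SEGMENT` the offset and the linear address are always the same number, which is why you can ignore the segment registers. Give a segment a nonzero base or a small limit and every pointer suddenly only makes sense relative to the segment it came from.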

What do you mean?
He means that you don't need those anymore.

You used to need them (badly), especially on the smaller CPUs because otherwise you were severely limited in memory.
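
For anyone who never had to deal with it, a rough sketch in C of the 8086 real-mode calculation: a 16-bit offset alone only reaches 64KB, so the CPU shifts a 16-bit segment value left by 4 bits and adds the offset to get a 20-bit physical address. The function name `real_mode_physical` is mine, purely for illustration.

```c
#include <stdint.h>

/* 8086 real mode: physical = segment * 16 + offset.
   The 8086 has 20 address lines, so the result is truncated to 20 bits
   (addresses past 1MB wrap around to 0). */
static uint32_t real_mode_physical(uint16_t segment, uint16_t offset)
{
    uint32_t addr = ((uint32_t)segment << 4) + offset;
    return addr & 0xFFFFFu;
}
```

Note that many segment:offset pairs name the same byte: 0x1234:0x0010 and 0x1235:0x0000 both come out as 0x12350. That aliasing is part of what made pointer comparison in the bigger memory models such a headache.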

Check out the 'mixed models' that were pretty common usage in the 80's.

There were lots of them:

  code          data          model name
  under 64KB    under 64KB    Small (-ms) or Tiny (-mt)
  over 64KB     under 64KB    Medium (-mm)
  under 64KB    over 64KB     Compact (-mc)
  over 64KB     over 64KB     Large (-ml)

I was so happy when I finally got out of 'model hell' and could use 'flat' mode using DJGPP. Finally C programming without the headaches.

http://en.wikipedia.org/wiki/DJGPP
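
The model table above boils down to two yes/no questions: does the code fit in one 64KB segment, and does the data? A small C sketch of that decision follows. `memory_model_t`, `SEGMENT_BYTES` and `pick_model` are hypothetical names for illustration only (no compiler exposed anything like this), and I'm assuming that exactly 64KB still counts as fitting. Tiny is lumped in with Small here, just as in the table.

```c
/* Hypothetical names; the flags in comments are the ones from the table. */
typedef enum {
    MODEL_SMALL,   /* -ms (or -mt): code and data each fit in one segment */
    MODEL_MEDIUM,  /* -mm: code needs far pointers, data does not */
    MODEL_COMPACT, /* -mc: data needs far pointers, code does not */
    MODEL_LARGE    /* -ml: both need far pointers */
} memory_model_t;

#define SEGMENT_BYTES 65536UL /* one 16-bit offset covers 64KB */

/* Pick the smallest model that fits the given code and data sizes. */
static memory_model_t pick_model(unsigned long code_bytes, unsigned long data_bytes)
{
    int big_code = code_bytes > SEGMENT_BYTES;
    int big_data = data_bytes > SEGMENT_BYTES;

    if (big_code && big_data) return MODEL_LARGE;
    if (big_code)             return MODEL_MEDIUM;
    if (big_data)             return MODEL_COMPACT;
    return MODEL_SMALL;
}
```

In flat mode the whole question disappears: every pointer is the same size and reaches everything.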

You're going to see a lot of references to segment selectors, and apart from the fact that FS may refer to thread information, you're never going to have to remember what any of it means. You can safely consider segmentation to be detritus.
Specialized segment prefixes mean jacksquat in an execution model with no specialized segments.
Do you have any documentation you'd recommend for this? I'm not sure what to read/study.

Thanks for your help cperciva.