As someone interested in understanding/getting into assembly language (actually for years), I have a small torrent of naive and hopefully not too annoying questions. :)

- Why and how is argv[1] at [esp+8]? I realize this is a Linux-specific implementation detail but would like to understand.
- argv[1] is "gotten" by reading from esi. You then proceed to poke eax and ecx. TIL that SI, AX and CX were implicitly linked. What specifically is going on here?
- Continuing along the implicit linked-ness train, how is accessing AL reading the value set in SI?
- How does the LEA usage here work? You're putting the address of "[ecx+eax-'0']" into ecx. First question, how is ecx not clobbered? Secondly, how does that dereference work? I see similar semantics used in the 2nd and 3rd instructions, it seems AX and CX are linked in some way (in this situation).
- So the jecxz is bailing out (to an assumed label, that's okay) if ecx is 0. Cool. But why do you now zero and then increment eax?
- At the end you have a loop around a single xadd instruction, (which, if I understand what's happening, is the equivalent of "edx += eax" here?). The only way I can interpret this is that the atoiloop routine was actually building up a stack (in the LEA bit) and this is ticker-tape-ing its way through that?

I appreciate any insight and your patience :)

If I'm learning something unfamiliar enough to be completely disorientating, reading through insights others have written down (i.e., the questions others asked and documented the answers to) doesn't seem to help. I have to ask a thousand questions of my own in order to get my bearings. This could be the fault of the assembly "tutorials" out there though. I can't remember the number of articles I've eye-glazedly read through; I understand binary, the stack, the basics, etc, incredibly incredibly well, but (as you can see) have tons of blocking issues and holes in my understanding.

Some I'm aware of - for example, I know I have no mental model of how x86 memory management is done, and I've never been able to find anything concise that just deals with that - but unfortunately some of the holes are so big all I know is that something's missing and that my eyes glaze over and I don't know why. In any case, learning difficulties FTL. Been wanting to understand asm since 2006.

NB. While responding to another answer I stumbled on https://codegolf.stackexchange.com/a/135618/15675, a(n exhaustingly) exhaustive exploration of Fibonacci in x86, also for Linux.
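The kernel puts them there when the program starts. I believe this layout comes from the System V ABI (its description of the initial process stack) rather than from a calling convention as such, so it's not Linux-specific: at the entry point, [esp] holds argc, [esp+4] holds argv[0], and [esp+8] holds argv[1].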
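To make the offsets concrete, here is a small Python model of the 32-bit entry stack. It's only a sketch: the helper name `build_entry_stack`, the addresses, and the program name `./fib` are all made up for illustration, and a real stack also carries envp and the auxiliary vector after argv.

```python
def build_entry_stack(args, esp=0xBFFF0000, strings_base=0xBFFF1000):
    """Model the words found at esp when a 32-bit process starts.

    Returns (stack, string_addrs). `stack` maps an address to the 32-bit
    word stored there; `string_addrs[i]` is the (made-up) address of the
    NUL-terminated string for args[i].
    """
    string_addrs = []
    addr = strings_base
    for arg in args:
        string_addrs.append(addr)
        addr += len(arg.encode()) + 1  # +1 for the NUL terminator

    stack = {esp: len(args)}  # [esp] = argc
    for i, string_addr in enumerate(string_addrs):
        stack[esp + 4 * (i + 1)] = string_addr  # [esp+4+4*i] = argv[i]
    stack[esp + 4 * (len(args) + 1)] = 0  # NULL pointer terminates argv
    return stack, string_addrs


esp = 0xBFFF0000  # hypothetical stack pointer at entry
stack, string_addrs = build_entry_stack(["./fib", "10"], esp=esp)
print(stack[esp])           # argc
print(hex(stack[esp + 8]))  # pointer to the argv[1] string
```

So reading [esp+8] gives a pointer to the text of the first argument, not the text itself, which is why it gets loaded into a register before anything is read through it.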
> argv[1] is "gotten" by reading from esi. You then proceed to poke eax and ecx. TIL that SI, AX and CX were implicitly linked. What specifically is going on here?
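They are not linked. The lodsb instruction loads one byte from the address in esi into al (the low byte of eax, not all of ax) and then advances esi by one, assuming the direction flag is clear. That is also why reading al afterwards gives you the character esi was pointing at.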
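If it helps, the effect of lodsb can be modelled in a few lines of Python. This is a sketch, not how the CPU is implemented: the `lodsb` function, the dict standing in for memory, and the addresses are all assumptions for illustration, and it only covers the direction-flag-clear case.

```python
def lodsb(memory, eax, esi):
    """Model `lodsb` with the direction flag clear: al = [esi]; esi += 1."""
    byte = memory[esi]               # the one and only memory read
    eax = (eax & 0xFFFFFF00) | byte  # only the low 8 bits (al) change
    esi = (esi + 1) & 0xFFFFFFFF     # esi now points at the next byte
    return eax, esi


# Hypothetical memory holding the string "42" plus its NUL terminator.
memory = {0x1000: ord("4"), 0x1001: ord("2"), 0x1002: 0}
eax, esi = 0xDEADBE00, 0x1000
eax, esi = lodsb(memory, eax, esi)
print(hex(eax), hex(esi))  # 0xdeadbe34 0x1001
```

Note that the upper 24 bits of eax survive untouched, so code that later uses all of eax needs them to be zero beforehand.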
> How does the LEA usage here work? You're putting the address of "[ecx+eax-'0']" into ecx. First question, how is ecx not clobbered? Secondly, how does that dereference work? I see similar semantics used in the 2nd and 3rd instructions, it seems AX and CX are linked in some way (in this situation).
lea just does math; it doesn't dereference anything. The brackets are only addressing syntax, and no memory is read. ecx is being clobbered, but that's what we want: the lea is doing the equivalent of ecx += eax - '0'.
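The arithmetic that lea performs can be sketched in Python like this. The function name `lea_ecx` is made up, and the sketch assumes the upper bytes of eax are already zero so that eax holds just the ASCII code of the digit.

```python
MASK32 = 0xFFFFFFFF


def lea_ecx(ecx, eax):
    """Model `lea ecx, [ecx+eax-'0']`: pure arithmetic, no memory access."""
    return (ecx + eax - ord("0")) & MASK32  # wrap like a 32-bit register


print(lea_ecx(0, ord("7")))   # 7
print(lea_ecx(40, ord("2")))  # 42
```

Subtracting '0' (0x30) is what turns the ASCII character into its numeric value, and doing it inside the lea folds the conversion and the add into one instruction.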
> So the jecxz is bailing out (to an assumed label, that's okay) if ecx is 0. Cool. But why do you now zero and then increment eax?
This will be the return value of main and the exit code of the program. Since you errored, you want to return a nonzero value.