x86-64 is really oriented around 64-bit integer values. For example, 32-bit operations zero-extend their result to fill the full 64-bit integer register. The ABI also assumes integral values are promoted to 64 bits and requires the stack to be aligned on calls (16 bytes under the System V ABI). That said, as long as you keep RSP aligned you can do whatever you want. Consider this code:

    extern void value(int* a, int* b, int* c);
    int main() {
        int a, b, c;
        value(&a, &b, &c);
        return a+b+c;
    }
This is how LLVM compiles it:

    subq $24, %rsp
    leaq 20(%rsp), %rdi
    leaq 16(%rsp), %rsi
    leaq 12(%rsp), %rdx
    callq value
    movl 16(%rsp), %eax
    addl 20(%rsp), %eax
    addl 12(%rsp), %eax
    addq $24, %rsp
    retq
Note that the int values are allocated at 4-byte alignment, while RSP is kept aligned to 8 bytes. If you add an additional parameter, 'd', you'll see that the compiler still allocates 24 bytes of stack and places the additional variable at 8(%rsp) (which is unused in the code above).
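If you want to try the four-parameter variant yourself, here is a minimal sketch. The names value4 and sum4 are hypothetical, and value4 is defined locally (rather than declared extern as in the original) only so the snippet builds on its own. To inspect the stack layout, keep the callee extern in a separate file so the compiler cannot inline it.

```c
#include <assert.h>

/* Hypothetical stand-in for the external function in the example above. */
static void value4(int *a, int *b, int *c, int *d)
{
    *a = 1;
    *b = 2;
    *c = 3;
    *d = 4;
}

/* Same shape as main() above, with one more local whose address is taken. */
int sum4(void)
{
    int a, b, c, d;

    /* On x86-64 the four ints need 16 bytes, which still fits in 24. */
    assert(sizeof a + sizeof b + sizeof c + sizeof d == 16);

    value4(&a, &b, &c, &d);
    return a + b + c + d;
}
```

The exact offsets the compiler picks may differ between compilers and versions, so treat the 8(%rsp) slot mentioned above as what one LLVM build produced rather than a guarantee.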
If you want to call C code conforming to the x86-64 System V ABI, RSP needs to be aligned to 16 bytes when you execute the call. If the code you generate never calls foreign code, 8-byte alignment is enough.
Since 8 bytes are already occupied by the return address pushed by the call that entered your function, you need to decrease RSP by a further 8, 24, 40, 56, 72, ... bytes before calling code generated by others.
Reason: keeping the stack 16-byte aligned makes it easier to allocate aligned 16-byte stack variables, which is useful because x86 has 16-byte registers (SSE) that are most efficiently loaded and stored at aligned addresses.
However, it isn't only performance that you lose by neglecting alignment. I learned the hard way that some code generated by gcc crashes if you call it with an unaligned stack (typically because it uses aligned SSE instructions such as movaps, which fault on a misaligned address).
That's why in this example LLVM allocates 24 bytes, even though 16 would be enough for 3 ints: 24 plus the 8-byte return address is 32, a multiple of 16.
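For a compiler writer, the 8, 24, 40, ... rule boils down to a small rounding computation. Here is a rough sketch in C. The helper name call_frame_adjust and the two constants are my own, and it assumes nothing else has been pushed since function entry (no saved registers, no frame pointer).

```c
#include <assert.h>
#include <stddef.h>

/* Bytes held by the return address that the caller's call pushed. */
#define RETURN_ADDRESS_BYTES 8u
/* Alignment the System V x86-64 ABI expects of RSP at a call. */
#define CALL_ALIGNMENT 16u

/*
 * Hypothetical helper: smallest amount to subtract from RSP in the prologue
 * so that at least locals_bytes are reserved and RSP is 16-byte aligned at
 * the next call. Assumes nothing else was pushed after function entry.
 */
size_t call_frame_adjust(size_t locals_bytes)
{
    /* Round (return address + locals) up to a multiple of 16... */
    size_t total = (locals_bytes + RETURN_ADDRESS_BYTES + CALL_ALIGNMENT - 1)
                   & ~(size_t)(CALL_ALIGNMENT - 1);
    /* ...then drop the 8 bytes that are already on the stack. */
    size_t adjust = total - RETURN_ADDRESS_BYTES;

    /* Invariant: return address plus adjustment is a multiple of 16. */
    assert((adjust + RETURN_ADDRESS_BYTES) % CALL_ALIGNMENT == 0);
    return adjust;
}
```

With this, call_frame_adjust(12) gives 24 for three ints, and call_frame_adjust(16) also gives 24 for four, which matches the subq $24 in the LLVM output. If your prologue pushes callee-saved registers, you would have to count those bytes as well.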
Another example (gcc):
To anyone writing x86-64 compilers, I recommend finding the x86-64 SYSV ABI spec and reading it. Saves debugging time.