Hacker News
by timewizard 535 days ago
> The CPU itself couldn't care less which registers you use for what.

Not all registers encode as operands equivalently (implicit rdx:rax for mul/div, implicit [rbx+al] for xlat, and rbp/r13 as a base always needing at least an imm8 displacement). Some have other encoding restrictions or special purposes (rdi and rsi for string instructions, rcx for counts). When segmentation was a thing, different registers had different default segments. Some are destroyed when certain opcodes are used (syscall: rcx, r11).
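To make the implicit rdx:rax case concrete, here is a minimal sketch in C. The names `wide_product` and `mul_wide` are made up for illustration, and `unsigned __int128` is a GCC/Clang extension rather than standard C. On x86-64 those compilers will typically lower the multiply to a single `mul` (or `mulx` when BMI2 is enabled), and plain `mul` always takes one input from rax and writes the product to rdx:rax, whatever registers the surrounding code would have preferred.

```c
#include <stdint.h>

/* Hypothetical result type: the two 64-bit halves of a 128-bit product. */
struct wide_product {
    uint64_t lo;
    uint64_t hi;
};

/* Full 64x64 -> 128-bit unsigned multiply.
   Assumes a compiler that provides unsigned __int128 (GCC or Clang). */
struct wide_product mul_wide(uint64_t a, uint64_t b)
{
    unsigned __int128 p = (unsigned __int128)a * b;
    struct wide_product r;
    r.lo = (uint64_t)p;          /* low half of the product  */
    r.hi = (uint64_t)(p >> 64);  /* high half of the product */
    return r;
}
```

Compiling this with `-O2 -S` and reading the assembly is the easiest way to see which registers your compiler actually picks.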

> So many wasted bytes on moving values between registers [...] Modern CPUs are fast

Well, they've special cased this anyways, as these will often be caught in the rename stage and not even occupy an execution slot. We've long recognized that passing these values in registers instead of on the stack is far more efficient, which is why the `fastcall` convention came about and got its name way back in the x86 days.
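For a concrete picture of the register-to-register moves being argued about, here is a small sketch. The function name `shift_left` is made up, and the assembly in the comment is only what GCC or Clang typically emit at `-O2` for the x86-64 System V ABI without BMI2, not a guarantee. The arguments arrive in rdi and esi, but a variable shift count has to sit in cl and the result has to come back in rax, so two moves show up that do no real work.

```c
#include <stdint.h>

/* Shift x left by n bits. The count is masked to 0..63 so the C shift is
   always well defined; x86 masks the count in hardware in the same way.

   Typical x86-64 System V output (an assumption, check your own compiler):
       mov rax, rdi    ; argument register -> return register
       mov ecx, esi    ; shift count must be in cl
       shl rax, cl
       ret
*/
uint64_t shift_left(uint64_t x, unsigned n)
{
    return x << (n & 63u);
}
```

With BMI2 enabled the compiler may use `shlx` instead, which accepts the count in any register and avoids the move into rcx.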

> but there's still tons of inefficiency in compiler output.

Which is also why the 'inline' heuristic exists. Once a call is inlined, the calling convention is abandoned entirely. I mean, things like ELF dynamic symbol tables, and linux thread local storage annoy me far more than calling conventions ever have.
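A rough way to see the inlining point for yourself is to compare the two loops below. All names are hypothetical, and `__attribute__((noinline))` is a GCC/Clang extension used here only to force a real call for comparison. `static inline` is just a hint, so the compiler still decides, but at `-O2` the first loop will usually contain no call and no argument shuffling at all, while the second has to follow the calling convention on every iteration.

```c
#include <stddef.h>
#include <stdint.h>

/* Small helper the compiler is free to inline into its callers. */
static inline uint64_t scale_add(uint64_t acc, uint64_t k, uint64_t c)
{
    return acc * k + c;
}

/* Same body, but inlining is forbidden (GCC/Clang-specific attribute),
   so each use is a real call with arguments in the ABI's registers. */
__attribute__((noinline))
static uint64_t scale_add_called(uint64_t acc, uint64_t k, uint64_t c)
{
    return acc * k + c;
}

/* Polynomial-style accumulation using the inlinable helper. */
uint64_t sum_inlined(const uint64_t *v, size_t n)
{
    uint64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc = scale_add(acc, 31, v[i]);
    return acc;
}

/* Identical computation through the non-inlined helper. */
uint64_t sum_called(const uint64_t *v, size_t n)
{
    uint64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc = scale_add_called(acc, 31, v[i]);
    return acc;
}
```

Both functions return the same value; only the generated code differs, which you can confirm by diffing the `-O2 -S` output.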

1 comment

> Well, they've special cased this anyways, as these will often be caught in the rename stage and not even occupy an execution slot

They still need to be fetched and decoded, and take up space in caches and RAM that could be used for more purposeful instructions.

> Which is also why the 'inline' heuristic exists.

Inlining has its own problems too.

> I mean, things like ELF dynamic symbol tables, and linux thread local storage annoy me far more than calling conventions ever have.

Don't get me started on the whole ELF and dynamic linking situation...