|
> This is very surprising for a language that targets somewhat high performance.
> Looks like it's a 5-10% performance hit, but makes it easier to provide good backtrace information

Very interesting indeed. Well, forcing your compiler to pass all parameters via the stack is pretty
simple, and it's not so uncommon. Probably the Go team just wanted to keep things simple.
Otherwise they would need two parameter-passing implementations,
and sometimes even a mixed one: when the platform does not have enough
registers (e.g. 6 registers available for arguments and a function that takes more than 6 params),
some parameters would be passed via registers and the rest via the stack; or
maybe all of them via the stack. And so on. So I am buying the "good backtrace"
point.

For example, suppose I have a silly logging function my_func() [C code],
which takes 8 params:

#include <stdio.h>

__attribute__ ((noinline))
void my_func(const char *module, const char *func, const char *level,
             int line, int B, int C,
             const char *app, const char *session)
{
    printf("%s:%s:%s:%d %d %s %s\n", module, func, level,
           line, B + C, app, session);
}

int main()
{
    my_func("core", __func__, "error", 1, 2, 3, "a.out", "dummy");
    return 0;
}
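
As a rough sketch of the mixed scheme described above: the snippet below models where each argument would land on 32-bit ARM, assuming the usual procedure call standard (first four word-sized arguments in r0-r3, the rest on the stack) and assuming every argument is a single 4-byte word, as is the case for the ints and pointers here. The names classify_arg, print_layout and struct arg_slot are made up for illustration; they are not part of any toolchain.

```c
#include <assert.h>
#include <stdio.h>

/* Simplified model of 32-bit ARM argument passing. ASSUMPTION: every
 * argument is one 4-byte word (int or pointer). Real ABIs also deal with
 * 64-bit values, floating point and structs, which this sketch ignores. */
enum { NUM_ARG_REGS = 4, WORD_SIZE = 4 };

struct arg_slot {
    int in_register; /* 1: passed in r<index>; 0: passed on the stack */
    int index;       /* register number, or byte offset from sp at the call */
};

/* Hypothetical helper: where does the arg_no-th (0-based) argument go? */
struct arg_slot classify_arg(int arg_no)
{
    struct arg_slot slot;

    assert(arg_no >= 0);
    if (arg_no < NUM_ARG_REGS) {
        /* The first four words travel in r0..r3. */
        slot.in_register = 1;
        slot.index = arg_no;
    } else {
        /* The rest are stored at increasing offsets from sp. */
        slot.in_register = 0;
        slot.index = (arg_no - NUM_ARG_REGS) * WORD_SIZE;
    }
    return slot;
}

/* Hypothetical helper: print the placement of each of nargs arguments. */
void print_layout(int nargs)
{
    for (int i = 0; i < nargs; i++) {
        struct arg_slot s = classify_arg(i);

        if (s.in_register)
            printf("arg %d -> r%d\n", i + 1, s.index);
        else
            printf("arg %d -> [sp, #%d]\n", i + 1, s.index);
    }
}
```

For the 8-parameter my_func() above, print_layout(8) would report r0-r3 for the first four arguments and stack offsets 0, 4, 8 and 12 for the remaining four.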

Let's compile it with gcc (-O2) for ARM and take a look at what
main() does:

000103d8 <main>:
103d8: e52de004 push {lr} ; (str lr, [sp, #-4]!)
103dc: e3002630 movw r2, #1584 ; 0x630
103e0: e3402001 movt r2, #1
103e4: e24dd014 sub sp, sp, #20
103e8: e3003638 movw r3, #1592 ; 0x638
103ec: e3403001 movt r3, #1
103f0: e3001600 movw r1, #1536 ; 0x600
103f4: e3401001 movt r1, #1
103f8: e58d200c str r2, [sp, #12]
103fc: e3000628 movw r0, #1576 ; 0x628
10400: e3400001 movt r0, #1
10404: e58d3008 str r3, [sp, #8]
10408: e3a02003 mov r2, #3
1040c: e3a03002 mov r3, #2
10410: e58d2004 str r2, [sp, #4]
10414: e58d3000 str r3, [sp]
10418: e3002620 movw r2, #1568 ; 0x620
1041c: e3402001 movt r2, #1
10420: e3a03001 mov r3, #1
10424: eb00004c bl 1055c <my_func>
10428: e3a00000 mov r0, #0
1042c: e28dd014 add sp, sp, #20
10430: e49df004 pop {pc} ; (ldr pc, [sp], #4)
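
Looks like a bunch of stores relative to the stack pointer: sp + 0, + 4, + 8 and + 12 bytes. Those are the last four arguments (B, C, app, session); the first four (module, func, level, line) go in r0-r3.

[Edit: should have passed __LINE__ instead of a hardcoded 1, but that doesn't change the assembly in any way that matters here.] |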