|
I expect an inlined interface or virtual call to cost the same as an inlined non-virtual call. But the CLR (4.5, Windows 7 x64, using 32 or 64-bit codegen) won't emit an inlined virtual/interface call for int Add(int, int), so it's slower.

In my simple program doing a loop, calling an Add function on an interface, it is definitely making a function call each time. It unrolls 4 times, and loads the function pointer once per iteration. I'd have thought it would only load it once overall. Loop is 89 bytes. There is no conditional inside the loop to check for the type.[1]

If I change it to not use the interface (don't cast to the interface type), it's unrolled and inlined. Loop is 34 bytes.[2]

It's the same on 32-bit, except there's no unrolling. The non-virtual loop body is 2 instructions (inc, add). The interface one has a push, 3 movs and a call. The virtual one requires two extra movs (to load the function pointer; with an interface the address is embedded as a literal).

Shrug. Maybe it still doesn't work with value types?

I started it without VS, then broke in with the debugger to get the disassembly. The loop is doing "y = x.Add(y, i)" where y is a local.

Edit: Aha! Using an interface method (not virtual) and strings, I was able to get inlining. I guess the CLR is still weak in dealing with value types.

1: Start of the loop using an interface:

lea r8d,[rdi-1]
mov rbx,qword ptr [FFEEFE60h]
mov edx,eax
mov rcx,rsi ; rsi is the object pointer
lea r11,[FFEEFE60h] ; I am embarrassed to admit I don't know what r11 is doing
call rbx ; the target just does lea eax,[rdx+r8] then ret
; similarly 3 more times then loop

2: Without using the interface, the loop body:

lea eax,[r8-1] ; r8's the counter
add ecx,eax
lea edx,[rcx+r8]
lea ecx,[rdx+rax]
lea eax,[r8+2]
add ecx,eax
; then loop
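
For anyone who wants to try this themselves, here is a minimal sketch of the kind of test described above. The original source isn't shown, so the names IAdder, Adder, ViaInterface and Direct are made up for illustration, and the iteration count is arbitrary. The only parts taken from the comment are the int Add(int, int) signature and the "y = x.Add(y, i)" loop.

```csharp
using System;
using System.Diagnostics;

// Hypothetical interface; only the Add signature comes from the comment.
public interface IAdder
{
    int Add(int a, int b);
}

// Hypothetical implementation; the original type is not shown.
public sealed class Adder : IAdder
{
    public int Add(int a, int b) { return a + b; }
}

public static class Program
{
    // Calls Add through the interface type, so the JIT sees an interface call.
    public static int ViaInterface(IAdder x, int n)
    {
        int y = 0;
        for (int i = 0; i < n; i++)
            y = x.Add(y, i);
        return y;
    }

    // Calls Add through the concrete type (no cast to the interface).
    public static int Direct(Adder x, int n)
    {
        int y = 0;
        for (int i = 0; i < n; i++)
            y = x.Add(y, i);
        return y;
    }

    public static void Main()
    {
        const int n = 100000000; // arbitrary; the sum wraps around (unchecked int arithmetic)
        var adder = new Adder();

        var sw = Stopwatch.StartNew();
        int a = ViaInterface(adder, n);
        Console.WriteLine("interface: {0} ms (result {1})", sw.ElapsedMilliseconds, a);

        sw.Restart();
        int b = Direct(adder, n);
        Console.WriteLine("direct:    {0} ms (result {1})", sw.ElapsedMilliseconds, b);
    }
}
```

Build in Release and run without the debugger attached, then break in to look at the disassembly, as described above. Whether either loop gets inlined or unrolled will depend on the runtime version and the JIT in use, so treat the timings as a rough indication only.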
|