by copx
4515 days ago
While we are at stupid microbenchmarks: the following loop in C (line being a struct, length is int):

    for (int i = 1; i <= 400000000; ++i) {
        ++line.length;
    }

compiles to the following ASM:

    @8:
    inc eax
    dec edx
    jne @8

...as expected. The compiler realized that there is no need to actually copy the intermediate values to line.length and does everything in registers.

Now here is the same loop in Lua (everything dynamically typed, line.length must be hashed (in theory) just like in Python):

    for i = 1, 400000000 do
        line.length = line.length + 1
    end

LuaJIT 2.1 generates the following ASM:

    ->LOOP:
    addsd xmm7, xmm0
    movsd [rax], xmm7
    add edi, +0x01
    cmp edi, 0x17d78400
    jle 0x7fee13effe0 ->LOOP

The C program executes in ~0.3s, the Lua one in ~0.5s, and those 0.5s include LuaJIT's startup, profiling, and JIT compilation time. So for a longer running program the difference would be even smaller.

Tl;dr: modern JIT compilers are amazing and can optimize away the things mentioned in the article.

For example, this is the entire program with your loop and a line to print the result (gcc 4.3.4/linux -O3),
In fact, without the printf all you get is "repz ret" (not counting .init overhead in the binary). That is because the compiler detects that line.length is never used and does not even set it.