| HN Mirror


by skemper911 2838 days ago

In Rust this looks rather painful. Might I suggest trying this in Julia? It might make a good comparison of performance, ease of use, and readability. Julia does a very nice job of compiling directly to SIMD instructions and lets you inspect the low-level code it generates.

    @inline function sin9_shaper(x)
        c0 = 6.28308759
        c1 = -41.33318707
        c2 = 81.39900205
        c3 = -74.66884436
        c4 = 33.15324345

        a = abs(x - round(x)) - 0.25
        a2 = a * a
        ((((a2 * c4 + c3) * a2 + c2) * a2 + c1) * a2 + c0) * a
    end

    function gen_sinwave(freq, init=0.0, step=0.1)
        wave = [sin9_shaper(x) for x = init:step:freq]
    end

    julia> @code_native gen_sinwave(1113.0);
        .section __TEXT,__text,regular,pure_instructions
    ; Function gen_sinwave {
    ; Location: REPL[60]:2
        pushl %ebx
        decl %eax
        subl $48, %esp
        vmovaps %xmm0, %xmm2
    ; Function gen_sinwave; {
    ; Location: REPL[60]:2
        decl %eax
        movl $769501344, %eax ## imm = 0x2DDDA8A0
        addl %eax, (%eax)
        addb %al, (%eax)
        decl %eax
        movl $773805120, %ecx ## imm = 0x2E1F5440
        addl %eax, (%eax)
        addb %al, (%eax)
        vmovsd (%ecx), %xmm1 ## xmm1 = mem[0],zero
        decl %eax
        movl %esp, %ebx
        vxorps %xmm0, %xmm0, %xmm0
        decl %eax
        movl %ebx, %edi
        calll %eax
        decl %eax
        movl $769544608, %eax ## imm = 0x2DDE51A0
        addl %eax, (%eax)
        addb %al, (%eax)
        decl %eax
        movl %ebx, %edi
        calll %eax
    ;}
        decl %eax
        addl $48, %esp
        popl %ebx
        retl
        nopw %cs:(%eax,%eax)
    ;}


3 comments

coder543 2837 days ago

Your formatting didn't work, but no, it doesn't "look painful."

What you wrote does not guarantee vectorization; it just relies on autovectorization.

Rust already does autovectorization magically behind the scenes thanks to LLVM (which Julia also uses), but explicit SIMD makes it a guarantee.
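
To make the distinction concrete, here is a minimal Julia sketch of the autovectorization side. `shape!` is a hypothetical helper, not something from the thread, and the coefficients are copied from the parent comment. `@simd` and `@inbounds` only give the compiler permission to vectorize the loop; whether SIMD instructions are actually emitted still depends on LLVM and the target CPU, which is exactly the "no guarantee" point above.

```julia
# Polynomial sine approximation, coefficients taken from the parent comment.
@inline function sin9_shaper(x)
    c0 = 6.28308759
    c1 = -41.33318707
    c2 = 81.39900205
    c3 = -74.66884436
    c4 = 33.15324345
    a = abs(x - round(x)) - 0.25
    a2 = a * a
    return ((((a2 * c4 + c3) * a2 + c2) * a2 + c1) * a2 + c0) * a
end

# Hypothetical helper: fill `out` in place with the shaped values of `xs`.
# @simd is a hint that the iterations are independent, not a guarantee
# that vector instructions will be generated.
function shape!(out::Vector{Float64}, xs::Vector{Float64})
    length(out) == length(xs) ||
        throw(DimensionMismatch("out and xs must have the same length"))
    @inbounds @simd for i in eachindex(out, xs)
        out[i] = sin9_shaper(xs[i])
    end
    return out
end

xs = collect(0.0:0.1:10.0)
out = shape!(similar(xs), xs)
println(out[1:3])
```

Explicit SIMD, by contrast, spells out the vector width and operations in the source, so the result does not depend on the optimizer recognizing the loop.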


byt143 2837 days ago

Here is how to do explicit SIMD in Julia: https://github.com/eschnett/SIMD.jl


kristofferc 2837 days ago

Your code only shows that

    gen_sinwave(1113.0)

calls

    gen_sinwave(1113.0, 0.0, 0.1)

(that is, the same call with the default arguments filled in). It is usually better to use @code_llvm, because the LLVM IR is typically easier to read.

    julia> @code_llvm gen_sinwave(1113.0);
    
    ; Function gen_sinwave
    ; Location: REPL[2]:2
    define nonnull %jl_value_t addrspace(10)* @julia_gen_sinwave_345638059(double) {
    top:
      %1 = call nonnull %jl_value_t addrspace(10)* @julia_gen_sinwave_345638060(double %0, double 0.000000e+00, double 1.000000e-01)
      ret %jl_value_t addrspace(10)* %1
    }

However, defining

    function gen_sinwave(freq, init=0.0, step=0.1)
        data = collect(init:step:freq)
        sin9_shaper.(data)
    end

and looking at

    @code_llvm gen_sinwave(1113.0, 0.0, 1.0)

we can see that there is a lot of auto-vectorization going on.
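
One way to check this without reading the IR by eye is to capture it as a string and search for LLVM vector types such as `<4 x double>`. This is only a rough sketch: `shape_all` is a hypothetical name, the regex is a heuristic, and the result varies with Julia version and CPU, so no particular output is assumed.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

# Polynomial sine approximation, coefficients taken from the parent comment.
@inline function sin9_shaper(x)
    c0 = 6.28308759
    c1 = -41.33318707
    c2 = 81.39900205
    c3 = -74.66884436
    c4 = 33.15324345
    a = abs(x - round(x)) - 0.25
    a2 = a * a
    return ((((a2 * c4 + c3) * a2 + c2) * a2 + c1) * a2 + c0) * a
end

# Hypothetical wrapper: broadcast the shaper over an existing array.
shape_all(data) = sin9_shaper.(data)

# Capture the optimized LLVM IR for the Vector{Float64} method as a String.
ir = sprint(code_llvm, shape_all, Tuple{Vector{Float64}})

# Heuristic: fixed-width vector types look like "<4 x double>" in LLVM IR.
has_vectors = occursin(r"<\d+ x double>", ir)
println("vector types found in IR: ", has_vectors)
```

If the broadcast kernel is not inlined into `shape_all`, the vector code can live in a callee and this check will report `false` even though the inner loop was vectorized, so a negative result is not conclusive.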


raphlinus 2837 days ago

From the asm you posted I have no idea whether it's optimized or not; it looks like two virtual function calls (calll %eax). Also, while I'm sure Julia is nice, the GC makes me worry about whether it'll work well in real-time audio synthesis (my primary use case).
