I don't think Go emits SIMD at all. Their assembler doesn't even support parsing it.
I think these are probably just bugs that we need to look at. The benchmarks where we do worse are the string benchmarks; perhaps it's our Unicode correctness that is hurting us, or something like that.
I should look into those benchmarks if you think there might be string problems. To be frank, I've never looked too closely at anything except for regex-dna.
I think it was recent, yeah. I also recall their assembler being unable to parse various SIMD instructions in the past, and I've seen code like the code in your link.
Hmm, Go 1.7 introduced support[1] for various AVX instructions (plus at least one SSE 4.2 instruction), some of which are used in blake2b-simd.
I can't wait to get SIMD on Rust stable. It's going to be exciting.