Hacker News new | ask | show | jobs
by slack3r 2670 days ago
Lol, my comment was restricted to SIMD, which definitely does not take 10 minutes.

Secondly, no mainstream compiler actually compiles code to the PCMPISTRI instruction. Presumably it was meant to be used directly as assembly. I'm not sure why you are bringing in this obscure instruction into the discussion of superoptimizers.

I personally have a paranoid fantasy where NSA/GCHQ introduced this instruction to speed up password cracking. :D

1 comments

You're citing the classic superoptimization paper, although it's actually quite old at this point. The goal of superoptimization has always been to try to compile to the no-real-C-equivalent instructions in an ISA, which includes the weird vector instructions such as PCMPISTRI. (I believe Strata attempted to match some of these instructions, and I think they managed to get formulas for about half of the weird immediate forms--these are the instructions that caused them to match "half" an instruction).

In any practical vectorized tight inner-loop, the block you're trying to optimize is inherently going to be large. Superoptimization is exponential in the size of the block being optimized, which limits its utility. That was my entire point: it becomes unacceptably expensive way too quickly to get used in compilers. (Some of the code I'm looking at right now has 100s of instructions in a single basic block, definitely not atypical for a compiler).

Don't be ridiculous, dude. No compiler is actually compiling to PCMPISTRI. It's meant to be used directly as assembly.

Of course I could be wrong. If you do know of any open-source compiler which compiles code to PCMPISTRI, let me know.

Superoptimization is not exactly exponential, it is NP-hard.

Of course, in the paranoid conspiracy I was referring to in my previous comment I predict the NSA/GCHQ is also using this directly via assembly. xD