by dnautics 2684 days ago
I don't know why this is being down-voted.

A few things: ARM and RISC-V definitely have SpecEx baked in (though on RISC-V you can choose not to include the SpecEx module). There are interesting alternatives to SpecEx. DSPs use delay slots, and I've seen delay slots used quite well in a GP-CPU. Getting high instruction saturation on a CPU with delay slots is a "hard compiler problem", but I have a few things to say about that:

Despite jokes about "better compilers", compilers are getting better (e.g. polyhedral optimization). One way to think of OOOEx/SpecEx is that it's figuratively the CPU JITting your code on the fly. The most popular programming language JITs aggressively anyway, so one wonders if there isn't some duplication of effort going on.

Furthermore, the most popular programming language isn't the most performant in terms of raw power, and it's pretty clear that in our current ecosystem just pushing operations through the FPU (which is what x86 optimizes for) isn't necessarily the most important thing in the world; uptime, reliability, fault tolerance, safe parallelization, distribution, and power conservation might be more important moving forward.

Hm, oops, apparently RISC-V has OOOEx, not SpecEx.


I understand this is nitpicking, but it's not accurate to say "RISC-V has speculative execution" or "ARM has OoO execution" and that they therefore suffer from Spectre and friends.

RISC-V and ARM are specifications of instruction sets, for which there exists an enormous domain of possible implementations. Spectre/Meltdown are not inherent features of instruction set architectures. They are emergent properties of certain implementations of those instruction set architectures.

For example, the BOOM implementation of RISC-V does out of order execution. The Rocket chip implementation does not. Both implement the RISC-V architecture.
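To make the distinction concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the toy instruction format and the names `run_in_order` and `run_out_of_order` are not RISC-V and do not model Rocket or BOOM. It also assumes each register is written only once, so there are no hazards to handle. The point is only that two "implementations" can execute instructions in a different order and still produce the same architectural state.

```python
# Toy "ISA": each instruction is (dest, op, src_a, src_b).
# Sources are register names (str) or integer immediates.
# Assumption for this sketch: every register is written at most once.
PROGRAM = [
    ("r1", "add", 1, 2),
    ("r2", "mul", "r1", 10),
    ("r3", "add", 5, 5),        # independent of r1 and r2
    ("r4", "add", "r2", "r3"),
]

OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


def _read(regs, src):
    """Return a register value, or the immediate itself."""
    return regs[src] if isinstance(src, str) else src


def _ready(regs, srcs):
    """An instruction is ready once all of its register sources exist."""
    return all(not isinstance(s, str) or s in regs for s in srcs)


def run_in_order(program):
    """Execute strictly in program order."""
    regs, trace = {}, []
    for i, (dest, op, a, b) in enumerate(program):
        regs[dest] = OPS[op](_read(regs, a), _read(regs, b))
        trace.append(i)
    return regs, trace


def run_out_of_order(program):
    """Execute the youngest ready instruction first (exaggerated reordering)."""
    regs, trace = {}, []
    pending = list(enumerate(program))
    while pending:
        for idx in range(len(pending) - 1, -1, -1):
            i, (dest, op, a, b) = pending[idx]
            if _ready(regs, (a, b)):
                regs[dest] = OPS[op](_read(regs, a), _read(regs, b))
                trace.append(i)
                del pending[idx]
                break
        else:
            raise ValueError("program reads a register that is never written")
    return regs, trace


if __name__ == "__main__":
    print(run_in_order(PROGRAM))
    print(run_out_of_order(PROGRAM))
```

Both calls end with the same register values; only the execution trace differs. The trace is exactly the kind of implementation detail that an ISA specification does not constrain.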

I'm not replying to you specifically. But I see this sort of thing on HN all the time and I feel like it's an important distinction to make.

Thank you, I should have been more careful. Spectre and Meltdown are in fact specific interactions that arise because OOO and SpecEx are hard, and it's easy to make mistakes given the high level of statefulness and complexity in contemporary chip designs (in this case, memory caching). More broadly, OOO and SpecEx make chip architectures difficult to reason about, and I'm sure more errors will emerge.

> Despite jokes about "better compilers", compilers are getting better

The compiler has to make static decisions. The hardware knows what is actually happening. There is an inherent information asymmetry at work that a "sufficiently smart" compiler seems unlikely to overcome.
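Branch prediction is one place where this asymmetry shows up. A rough Python sketch, under stated assumptions: the branch outcome depends on input data that changes phase halfway through, `static_accuracy` stands in for a single hint fixed ahead of time, and `TwoBitPredictor` is the textbook 2-bit saturating counter, not the predictor of any particular CPU.

```python
def static_accuracy(outcomes, hint=True):
    """Accuracy of one fixed prediction chosen before the program runs."""
    return sum(o == hint for o in outcomes) / len(outcomes)


class TwoBitPredictor:
    """Textbook 2-bit saturating counter: states 0-1 predict not-taken, 2-3 taken."""

    def __init__(self):
        self.state = 2  # start at weakly taken

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def dynamic_accuracy(outcomes):
    """Accuracy of a predictor that observes each outcome as it happens."""
    predictor = TwoBitPredictor()
    hits = 0
    for taken in outcomes:
        hits += predictor.predict() == taken
        predictor.update(taken)
    return hits / len(outcomes)


# Hypothetical workload: the branch is taken 500 times, then not taken 500 times.
OUTCOMES = [True] * 500 + [False] * 500

if __name__ == "__main__":
    print(static_accuracy(OUTCOMES))   # 0.5
    print(dynamic_accuracy(OUTCOMES))  # 0.998
```

On this made-up workload the fixed hint is right half the time, while the counter mispredicts only twice, right after the phase change. Real workloads and real predictors are far messier; this only illustrates the kind of information the hardware sees at run time.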

My intuition says software can't beat the speed of a superscalar OOO CPU anymore than a GP CPU can beat a roughly equivalent DSP for algorithms suitable to run on the DSP, but I have no proof for that.

I'll also note that we've been promised "smarter compilers" for decades. Intel has tried that route several times. No one has ever made it work.

> The compiler has to make static decisions.

Pretty sure I mentioned JITting in my comment.

> My intuition says software can't beat the speed of a superscalar OOO CPU anymore

How good is good enough? I mean, we have distributed TensorFlow, which is basically on-the-fly compilation that can reorganize your computational graph around nodes with GPUs separated by network latency, or Julia, where you can drop in a GPUArray as a datatype and move computation to the GPU without changing your code.

If we go to something a bit more baroque, Java is within a factor of 1.5 of C/C++ these days.

Could you hand roll a better solution? Probably. Would it be worth it? Doubtful.

I think it's definitely worth exploring this angle because modern JIT compilers have become very advanced, and there's still a lot of juice left to squeeze there. Look at some of the things Graal is doing and it looks a lot like what OOO speculation is doing - it'll recompile branches on the fly based on profiling information and things like that.
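As a toy illustration of that resemblance (guess, check, roll back), here is a short Python sketch. It is not how Graal works internally and uses none of its APIs; `ProfilingDispatcher`, `guard`, and `threshold` are hypothetical names chosen for this example.

```python
class ProfilingDispatcher:
    """Toy profile-guided specialization with a guard and deoptimization."""

    def __init__(self, generic, fast_path, guard, threshold=100):
        self.generic = generic        # always-correct slow path
        self.fast_path = fast_path    # only valid while guard(x) holds
        self.guard = guard            # cheap check of the speculated assumption
        self.threshold = threshold    # consecutive hits needed before specializing
        self.hits = 0
        self.specialized = False
        self.deopts = 0

    def __call__(self, x):
        if self.specialized:
            if self.guard(x):
                return self.fast_path(x)
            # Speculation failed: fall back and start profiling again.
            self.specialized = False
            self.hits = 0
            self.deopts += 1
            return self.generic(x)
        # Profiling mode: count how often the assumption holds in a row.
        if self.guard(x):
            self.hits += 1
            if self.hits >= self.threshold:
                self.specialized = True
        else:
            self.hits = 0
        return self.generic(x)


def double_generic(x):
    """Handles ints as well as anything float() accepts."""
    return x * 2 if isinstance(x, int) else float(x) * 2


def double_int(x):
    """Fast path that is only valid for ints."""
    return x << 1


double = ProfilingDispatcher(double_generic, double_int,
                             guard=lambda x: isinstance(x, int))

if __name__ == "__main__":
    for i in range(150):
        double(i)
    print(double.specialized)  # True: the int-only path is in use
    print(double("3.5"))       # 7.0, after falling back to the generic path
    print(double.deopts)       # 1
```

The guard plays the role of the check that a speculating CPU performs when a branch finally resolves: if the assumption held, the fast result stands; if not, the work is redone on the safe path.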

Nvidia Denver couples a software-based JIT/translator with an in-order VLIW backend. It is vulnerable to Spectre.