by fredoralive
1557 days ago
Successful architectures seem to need a certain degree of pragmatism. ARM isn't exactly the RISCiest RISC, nor is AMD64 as baroque as the outer limits of CISC like iAPX 432. FJCVTZS is an example of that pragmatism: the JavaScript spec says float to int should be done the way that x86 does it, the original ARM FCVTZS (no J) didn't do it the same way, and JavaScript is so important that you have to add a special case. I hope I'm not mischaracterising the RISC-V side, but I seem to recall their argument against things like FJCVTZS was that there should be some standard sequence of instructions that compilers emit for that special case, and the instruction decoder on high-end CPUs should be magic enough to detect the sequence and do optimal things (fused instructions?). Which kinda felt like "we must keep the instruction set as simple as possible, even if it makes the implementation of high-performance CPUs complex". See also the "compressed instructions" stuff, which again feels like passing the buck for complexity onto the CPU implementation side (unless it's just a Thumb-like 16-bit-wide instruction set given a misleading name).
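To make the FJCVTZS point a bit more concrete, here is a minimal Python sketch of the two conversion behaviours in question. It assumes the JavaScript side is the spec's ToInt32 operation (truncate, then wrap modulo 2^32) and that a plain FCVTZS-style convert saturates out-of-range values. The function names `js_to_int32` and `saturating_to_int32` are made up for illustration and do not come from any real library.

```python
import math

def js_to_int32(x: float) -> int:
    """ECMAScript-style ToInt32: truncate toward zero, then wrap modulo 2**32."""
    if math.isnan(x) or math.isinf(x):
        return 0
    n = math.trunc(x) % (1 << 32)  # Python's % gives a result in [0, 2**32)
    # Reinterpret the low 32 bits as a signed value.
    return n - (1 << 32) if n >= (1 << 31) else n

def saturating_to_int32(x: float) -> int:
    """Saturating convert (assumed FCVTZS-like): clamp out-of-range values."""
    if math.isnan(x):
        return 0
    if x >= 2**31:
        return 2**31 - 1
    if x <= -(2**31):
        return -(2**31)
    return math.trunc(x)

if __name__ == "__main__":
    for value in (3.7, -1.5, 2.0**31, 2.0**32 + 5):
        print(value, js_to_int32(value), saturating_to_int32(value))
```

The two functions agree for in-range inputs and only diverge once the value no longer fits in 32 bits, which is exactly the corner case a JavaScript engine has to patch up in software when the hardware convert saturates.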
|
The compressed instructions are quite lightweight. Choosing them is generally an assembler-level thing, and the decoder on the CPU side is apparently ~400 gates.
The compressed instructions are indeed a 16-bit-wide thing, but they fix some of the flaws in Thumb. Generally they have more implicit operands, or their operands range over a subset of the registers, so that everything fits in 16 bits.
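As a rough illustration of why the decompressor can be so small, here is a Python sketch that expands one compressed instruction, C.ADDI, into its 32-bit ADDI equivalent. The implicit operand is the source register, which is always the same as the destination. The helper name `expand_c_addi` is hypothetical, and this handles only that one instruction, not the whole extension.

```python
def expand_c_addi(half: int) -> int:
    """Expand a 16-bit C.ADDI encoding into the equivalent 32-bit ADDI.

    C.ADDI layout: funct3=000 [15:13], imm[5] [12], rd/rs1 [11:7],
    imm[4:0] [6:2], quadrant 01 [1:0].
    """
    if (half & 0b11) != 0b01 or ((half >> 13) & 0b111) != 0b000:
        raise ValueError("not a C.ADDI encoding")
    rd = (half >> 7) & 0x1F
    imm = ((half >> 2) & 0x1F) | (((half >> 12) & 1) << 5)
    if imm & 0x20:          # sign-extend the 6-bit immediate
        imm -= 0x40
    # ADDI rd, rd, imm: imm[11:0] | rs1 | funct3=000 | rd | opcode 0x13
    return ((imm & 0xFFF) << 20) | (rd << 15) | (rd << 7) | 0x13

if __name__ == "__main__":
    # 0x0505 is c.addi a0, 1; it expands to addi a0, a0, 1.
    print(hex(expand_c_addi(0x0505)))
```

The expansion is pure bit shuffling plus a sign extension, with no state and no lookup beyond the opcode fields, which fits the claim that the hardware cost is small.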
But the real trick is that compression and fusion dovetail into each other, such that a sequence of compressed instructions can decompress into a fusable pair/tuple, which then decodes into a single internal micro-op. This creates a way to handle common idioms and special cases without introducing an ever-growing number of instructions. Or at least that's the basic claim by the RISC-V folks. I think they've done enough homework on this to not be trivially wrong, so it'll be interesting to see how things go.
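A toy model of the fusion idea, in Python: a decoder pass that scans already-expanded instructions and collapses one known idiom into a single internal op. The idiom used here, `slli rd, rs, 32` followed by `srli rd, rd, 32` (zero-extend the low 32 bits), is one commonly cited fusion candidate. The `Insn` type, the `fuse` function and the `fused_zext32` micro-op name are all invented for this sketch, and real hardware would do this in decode logic rather than software.

```python
from typing import List, NamedTuple

class Insn(NamedTuple):
    op: str
    rd: int
    rs1: int
    imm: int

def fuse(stream: List[Insn]) -> List[Insn]:
    """Collapse slli/srli-by-32 pairs into one hypothetical internal op."""
    out: List[Insn] = []
    i = 0
    while i < len(stream):
        a = stream[i]
        b = stream[i + 1] if i + 1 < len(stream) else None
        if (b is not None
                and a.op == "slli" and b.op == "srli"
                and a.imm == 32 and b.imm == 32
                and b.rs1 == a.rd and b.rd == a.rd):
            # The intermediate value is overwritten, so one op suffices.
            out.append(Insn("fused_zext32", a.rd, a.rs1, 0))
            i += 2
        else:
            out.append(a)
            i += 1
    return out

if __name__ == "__main__":
    program = [
        Insn("slli", 10, 11, 32),
        Insn("srli", 10, 10, 32),
        Insn("addi", 12, 12, 1),
    ]
    for insn in fuse(program):
        print(insn)
```

A low-end core can simply skip this pass and execute the two shifts as written, which is the appeal of the approach: the idiom is an optimisation opportunity, not a new instruction everyone must implement.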