by FullyFunctional 1645 days ago
My day job for 6+ years has been implementing high-perf RISC-V cores, and my name is in many of the RISC-V specs.

Variable-length ISAs are characterized by not being able to tell where an instruction begins without knowing the entry point. This applies to RISC-V with compressed instructions. Finding the boundaries is akin to a prefix scan and has a cost roughly linear in the scan length, but IMO the biggest loss is that you can’t begin predecode at I$ fill time.
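To make the prefix-scan point concrete, here is a minimal Python sketch (not anyone's actual decoder). It assumes only the 16-bit and 32-bit RISC-V encodings exist, and the names `insn_length` and `find_boundaries` are made up for illustration. Each instruction's start depends on the length of the one before it, so the scan is serial from the entry point.

```python
def insn_length(parcel: int) -> int:
    """Length in bytes of an instruction whose first 16-bit parcel is `parcel`.

    Assumption: only 16- and 32-bit encodings are modeled. In RISC-V, low
    bits 0b11 mean "not compressed"; the longer reserved formats are ignored.
    """
    return 4 if (parcel & 0b11) == 0b11 else 2


def find_boundaries(parcels, entry=0):
    """Return the parcel indices at which instructions start, scanning from `entry`."""
    starts = []
    i = entry
    while i < len(parcels):
        starts.append(i)
        # The next start is only known once this instruction's length is known.
        i += insn_length(parcels[i]) // 2
    return starts


# c.nop, addi a0,x0,0 (0x00000513 as two parcels), c.jr ra
code = [0x0001, 0x0513, 0x0000, 0x8082]
print(find_boundaries(code))           # scan from the real entry point
print(find_boundaries(code, entry=2))  # scan from the middle of the addi
```

Starting in the middle of the `addi` yields a different but equally plausible-looking set of boundaries, which is why the bytes alone don't tell you where instructions begin.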

3 comments

It sounds like you regret the decision to make RISC-V variable length. Is that correct?
I fought against making the _current_ way to do compressed instructions a mandated part of the Unix profile, but RISC-V was (at least at the time) dominated by microcontroller people and there was a lack of appreciation of the damage it incurred. A lot of people far more senior than me couldn't believe what happened.

Interesting to contrast with Arm, which upon defining AArch64 did _away_ with variable-length instructions and thus also page-crossing ones. Maybe they knew something.

Can't you predecode speculatively, then redecode if you see a compressed instruction? Also I assume the bottleneck there is instruction cache, no?
> IMO the biggest loss is that you can’t begin predecode at I$ fill time.

That helps enough to overcome the increased code size?

I really wouldn't say they learned nothing from x86, though. You only have to look at 2 bits, and if you can get your users to put in the slightest effort then compilers can be told not to use C (the compressed extension).

That's a strawman. There are infinitely many ways to achieve the same or better density without the drawback. Allowing instructions to span cache lines, or even pages, is a mistake that we'll pay for forever.

The simplest possible mitigation would have been to disallow an instruction from spanning a 64-byte boundary. It would have almost no impact on instruction density, but it would have saved a lot of headaches for implementations.

Strawman? I wasn't even trying to characterize anyone else's point, I was just trying to list some significant improvements over x86.

> The simplest possible mitigation would have been to disallow an instruction from spanning a 64-byte boundary.

Sure, that sounds good. But before this you hadn't even mentioned any problems with split instructions that need to be mitigated.

(You did mention decoding without a known entry point, but a rule like that doesn't guarantee you can find the start of an instruction. And if it would help to know that a block of 64 bytes probably starts with an aligned instruction, that seems like something you could work out with compiler writers even without a spec.)

I did forget to mention the requirement that you can't branch into the middle of an instruction. If you have both of these constraints then you can unambiguously determine the location of all instructions in any aligned 64-byte block, including at I$ fill time.
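A rough sketch of what those two constraints buy, in Python. It assumes a 64-byte line that holds only instructions, little-endian encoding, and only 16- and 32-bit instruction lengths; `predecode_line` is a hypothetical name, not part of any real core or tool. Because no instruction may cross into the line, offset 0 is always an instruction start, so the scan needs no entry point.

```python
LINE_BYTES = 64


def predecode_line(line: bytes) -> list:
    """Return byte offsets of every instruction start in one aligned line.

    Assumptions: the line contains only instructions, no instruction crosses
    the line boundary, and only 16/32-bit encodings are used.
    """
    if len(line) != LINE_BYTES:
        raise ValueError("expected exactly one 64-byte line")
    starts = []
    off = 0  # offset 0 must be an instruction start under the no-crossing rule
    while off < LINE_BYTES:
        # Little-endian: the two length bits are in the first byte.
        length = 4 if (line[off] & 0b11) == 0b11 else 2
        if off + length > LINE_BYTES:
            # Under the proposed rule this is illegal and would trap at fetch.
            raise ValueError("instruction crosses the line boundary")
        starts.append(off)
        off += length
    return starts


C_NOP = bytes([0x01, 0x00])             # c.nop
ADDI = bytes([0x13, 0x05, 0x00, 0x00])  # addi a0, x0, 0
line = C_NOP + ADDI + C_NOP * 29        # 2 + 4 + 58 = 64 bytes
print(predecode_line(line)[:4])
```

The scan inside the line is still serial, but it can run when the line is filled rather than when the fetch address is known.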

Implementing this would require instruction fetch to take an exception on line-crossing instructions (which must be illegal) and a change to the assembler to insert a 16-bit padding nop or expand a compressed instruction to maintain the alignment. There is nothing needed from the compiler (or linker AFAICT). JITs will have to be aware though.
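The assembler side could look roughly like the sketch below. This is only an illustration of the padding option, assuming a 2-byte-aligned base address and an input already reduced to (mnemonic, length) pairs; `layout` is a hypothetical helper, not an existing assembler interface. The alternative of expanding a preceding compressed instruction to its 32-bit form is not shown.

```python
LINE_BYTES = 64


def layout(insns, base=0):
    """Assign addresses, padding so that no instruction crosses a 64-byte line.

    `insns` is a list of (mnemonic, length_in_bytes) with lengths 2 or 4.
    Returns a list of (address, mnemonic, length).
    """
    if base % 2:
        raise ValueError("base must be 2-byte aligned")
    out = []
    addr = base
    for name, length in insns:
        if length not in (2, 4):
            raise ValueError("only 16- and 32-bit instructions are modeled")
        if addr % LINE_BYTES + length > LINE_BYTES:
            # Only reachable for a 4-byte instruction at line offset 62:
            # insert a 16-bit nop so the instruction starts on the next line.
            out.append((addr, "c.nop", 2))
            addr += 2
        out.append((addr, name, length))
        addr += length
    return out


prog = [("c.addi", 2)] * 31 + [("addi", 4)]
for entry in layout(prog)[-2:]:
    print(entry)
```

Label addresses would have to be resolved after this pass, since the padding shifts everything that follows it.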

You'd also need to guarantee that there are no constants or other non-instruction data in the same cache line as instructions. If that's a reasonable constraint then sure, that sounds like it would be helpful.