by sylware 873 days ago
Yep, same, I am keeping an eye on Oasis, but running powerful GPU drivers would mean porting much of user space from C++ to hand-written RISC-V assembly, SDK included. No rush though: concurrent access and memory coherence for device memory are still not finalized.

I have been writing quite a lot of x64 assembly recently, and the limit of 16 GPRs has been painful. I am sure that once I ramp up on rv64 assembly programming, those 32 GPRs will feel like fresh air.

On the other hand, I am not fond of the ABI register names, or of the pseudo-instructions that involve mini-compilation. I'll stick to xNUMBER register names and won't use pseudo-instructions, just as I will avoid any abuse of the preprocessor.


>On the other hand, I am not fond of the ABI register names

Why? They're a simple substitution, and very helpful for following the ABI.

>and the pseudo-instructions involving mini-compilation.

Again, why? These aren't specific to the assembler used, but rather, defined in the specification itself. This means they are reliable, and will always be there for as long as you use a RISC-V compliant assembler.

The ABI names are also the register names you will see in disassembler output, debuggers and other tools.
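To show how mechanical the substitution is, here is a rough Python sketch that rewrites xNUMBER names into the standard ABI names. The helper name `to_abi` is made up for illustration; only the name table itself comes from the RISC-V calling convention.

```python
import re

# Standard RISC-V integer register ABI names, indexed by x-register number.
# x8 is s0, which is also known as fp (frame pointer).
ABI_NAMES = (
    ["zero", "ra", "sp", "gp", "tp"]        # x0-x4
    + ["t0", "t1", "t2"]                    # x5-x7
    + ["s0", "s1"]                          # x8-x9
    + [f"a{i}" for i in range(8)]           # x10-x17
    + [f"s{i}" for i in range(2, 12)]       # x18-x27
    + [f"t{i}" for i in range(3, 7)]        # x28-x31
)


def to_abi(line):
    """Replace xN register names in one line of assembly with ABI names.

    Hypothetical helper: names outside x0..x31 are left untouched.
    """
    def swap(match):
        number = int(match.group(1))
        return ABI_NAMES[number] if number < 32 else match.group(0)

    return re.sub(r"\bx(\d{1,2})\b", swap, line)


print(to_abi("addi x10, x11, 0"))
print(to_abi("add x2, x2, x8"))
```

So either spelling names the same register; the ABI name just also tells you its role in the calling convention.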

Also, you might be interested in this new RVA22+V board[0].

0. https://forum.banana-pi.org/t/leading-the-future-of-computin...

The standard pseudo-instructions are not just standard. They express idioms that get treated differently, sometimes also by hardware.

For example, `li` gets expanded by the assembler into `lui` and `addi`, which larger RISC-V cores can recognise and fuse back into a single op. Using `xori` instead of `addi` could produce the same result but wouldn't get fused.
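As a rough illustration of that expansion, here is a Python sketch of how a 32-bit constant could be split into a `lui` immediate and an `addi` immediate. It assumes RV32 only (RV64 assemblers use different sequences), and `split_li` and `expand_li` are made-up names, not anything from a real assembler.

```python
def split_li(value):
    """Split a 32-bit constant into (hi20, lo12) so that
    (hi20 << 12) + lo12 == value modulo 2**32, with lo12 in [-2048, 2047]."""
    value &= 0xFFFFFFFF
    lo12 = value & 0xFFF
    if lo12 >= 0x800:
        # addi sign-extends its 12-bit immediate, so treat the low part as
        # negative and let the upper part absorb the carry.
        lo12 -= 0x1000
    hi20 = ((value - lo12) >> 12) & 0xFFFFF
    return hi20, lo12


def expand_li(rd, value):
    """Return one possible RV32 expansion of `li rd, value` (a sketch)."""
    hi20, lo12 = split_li(value)
    if hi20 == 0:
        # Small constants fit in a single addi from x0.
        return [f"addi {rd}, x0, {lo12}"]
    insns = [f"lui {rd}, {hi20:#x}"]
    if lo12 != 0:
        insns.append(f"addi {rd}, {rd}, {lo12}")
    return insns


for constant in (42, 0x1000, 0x12345678, 0x12345FFF):
    print(hex(constant), expand_li("x5", constant))
```

The last case is the one that bites people writing the pair by hand: because `addi` sign-extends, the `lui` immediate has to be bumped by one when bit 11 of the constant is set.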

Next, some idioms get recognised and automatically assembled into "compressed" 16-bit instructions to save space. For example, `mv rd, rs` and `addi rd, rs, 0` both get assembled into `c.mv rd, rs`. And on a larger RISC-V core, `c.mv` could be handled as just a register rename in the decoder, thus taking 0 cycles.
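To make the size difference concrete, here is a small Python sketch that encodes the same move both ways. The bit layouts follow the base ISA and the C extension as I understand them; `assemble_mv` is a made-up helper that only imitates the choice an assembler makes when compression is enabled.

```python
def encode_addi(rd, rs1, imm):
    """32-bit I-type encoding of `addi rd, rs1, imm`."""
    if not (0 <= rd < 32 and 0 <= rs1 < 32 and -2048 <= imm < 2048):
        raise ValueError("operand out of range")
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (0b000 << 12) | (rd << 7) | 0x13


def encode_c_mv(rd, rs2):
    """16-bit CR-type encoding of `c.mv rd, rs2` (neither register may be x0)."""
    if not (0 < rd < 32 and 0 < rs2 < 32):
        raise ValueError("c.mv requires rd != x0 and rs2 != x0")
    return (0b1000 << 12) | (rd << 7) | (rs2 << 2) | 0b10


def assemble_mv(rd, rs, compressed=True):
    """Hypothetical helper: bytes for `mv rd, rs`, compressed when possible."""
    if compressed and rd != 0 and rs != 0:
        return encode_c_mv(rd, rs).to_bytes(2, "little")
    return encode_addi(rd, rs, 0).to_bytes(4, "little")


# mv a0, a1 (x10, x11): 4 bytes uncompressed, 2 bytes with the C extension.
print(assemble_mv(10, 11, compressed=False).hex())
print(assemble_mv(10, 11, compressed=True).hex())
```

Same source line, half the bytes, and you never had to type `c.mv` yourself.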