by adastra22 102 days ago
Also the bit manipulation extension wasn't part of the core. So things like bit rotation are slow for no good reason, if you want portable code. Why? Who knows.
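
For context, here is a minimal sketch of the portable rotate idiom this is about. The name `rotl32` is just for illustration. Compilers such as GCC and Clang typically recognize this pattern and emit a single rotate instruction when the target has one (on RISC-V, that means building with the Zbb extension enabled); on a base-ISA-only target it usually becomes two shifts and an OR.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (illustrative helper name).
 * Masking both shift counts keeps n == 0 well defined in C. */
static inline uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}
```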

> Also the bit manipulation extension wasn't part of the core.

This is primarily because core is primarily a teaching ISA. One of the best parts about RiscV is that you can teach a freshman level architecture class or a senior level chip building project with an ISA that is actually used. Anything powerful enough to run Linux (other than a Linux you build manually from source) will support a profile that bundles all the commonly needed instructions to be fast.

Bit manipulation instructions are part and parcel of any curriculum that teaches CPU architecture. They are the basic building blocks for many more complex instructions.

https://five-embeddev.com/riscv-bitmanip/1.0.0/bitmanip.html

I can see quite a few items on that list that imnsho should have been included in the core and for the life of me I can't see the rationale behind leaving them out. Even the most basic 8 bit CPU had various shifts and rolls baked in.

This is the reason behind the profiles like RVA23 which include bitmanip, vector and a large number of other extensions. Real chips coming very soon will all be RVA23.
Neat. I can't wait to get my hands on a devboard.
The earliest I know of coming is the SpaceMit K3, which Sipeed will have dev boards for.
The Milk-V Jupiter 2 (coming out in April) is RVA23 too.
Nice board but very low on max RAM.
32-bit barrel shifters consume significant area and RISC-V was developed to support resource constrained low cost embedded hardware in a minimal ISA implementation.
The 32-bit ARM architecture included a barrel shifter as part of its basic design, as in every instruction had a shift field.

If a CPU built in 1985 with a grand total of 26 000 transistors could afford it, I am pretty sure that anything built in this century could afford it too.

26k is a lot of transistors for an embedded MCU.

You'd be excluding many small CPUs which exist within other chips running very specialized code.

As profiles mandate these instructions anyway, there's no good reason to complicate the most basic RISC-V possible.

RISC-V is the ISA for everything, from the smallest such CPUs to supercomputers.

What MCUs are you thinking of?

To the best of my knowledge (and Google-fu), 26K really isn't a lot of transistors for an embedded MCU - at least not a fully-featured 32-bit one comparable to a minimal RISC-V core. An ARM Cortex M0, which is pretty much the smallest thing out there, is around 10K gates => around 40K transistors. This is also around the same size as a minimal RISC-V core AFAICT.

The ARM core has a shifter, though.

IIUC this is a lot less true in the modern era. Even with 24nm transistors (the cheapest transistor last time I checked), modern microcontrollers have a fairly big transistor budget for the core (since 80+% of the transistors are going to sram anyway).
You can save a lot of silicon by doing 8 or 16 bit shifters and then doing the rest at the code generation level. Not having any seems really anemic to me.
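
A rough sketch of what "doing the rest at the code generation level" could look like, assuming hardware that can only shift 16-bit values by 0 to 15 bits. The helpers `shl16`, `shr16` and `shl32_via_16` are hypothetical names standing in for the narrow hardware shifter and the sequence a compiler would emit; this is an illustration, not any particular core's design.

```c
#include <stdint.h>

/* Stand-ins for a narrow hardware shifter: 16-bit operand, n in 0..15. */
static inline uint16_t shl16(uint16_t v, unsigned n)
{
    return (uint16_t)((uint32_t)v << n);
}

static inline uint16_t shr16(uint16_t v, unsigned n)
{
    return (uint16_t)((uint32_t)v >> n);
}

/* 32-bit left shift composed from 16-bit shifts plus a half-word move. */
static inline uint32_t shl32_via_16(uint32_t x, unsigned n)
{
    uint16_t lo = (uint16_t)x;
    uint16_t hi = (uint16_t)(x >> 16);

    n &= 31u;
    if (n >= 16u) {          /* shifting by 16 is just a half-word move */
        hi = lo;
        lo = 0;
        n -= 16u;
    }
    if (n != 0u) {           /* remaining 1..15 bits use the narrow shifter */
        hi = (uint16_t)(shl16(hi, n) | shr16(lo, 16u - n));
        lo = shl16(lo, n);
    }
    return ((uint32_t)hi << 16) | lo;
}
```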
It was the case even 15 years ago when Cortex M0/M3 really started to get traction, that the processor area of ARM cores was small enough to not make a difference in practice.
Yeah I don’t get it. Shifts and rolls are among the simplest of all instructions to implement because they can be done with just wires, zero gates. Hard to imagine a justification for leaving them out.
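
To make the "just wires" point concrete: a rotate by a fixed distance is only a rewiring, while a rotate by a variable distance is commonly built as a barrel shifter, i.e. one stage of 2:1 multiplexers per bit of the shift amount (five stages for 32 bits). Below is a small C model of that structure, with the hypothetical name `barrel_rotl32`. It is a behavioral sketch, not a description of any specific RISC-V or ARM implementation.

```c
#include <stdint.h>

/* Behavioral model of a 32-bit barrel rotator.
 * Each stage rotates by a fixed power of two (pure rewiring in hardware)
 * and a row of 2:1 muxes, driven by one bit of n, picks rotated or not. */
static inline uint32_t barrel_rotl32(uint32_t x, unsigned n)
{
    for (unsigned stage = 0; stage < 5; stage++) {
        unsigned k = 1u << stage;                        /* 1, 2, 4, 8, 16 */
        uint32_t rotated = (x << k) | (x >> (32u - k));  /* fixed wiring   */
        uint32_t sel = ((n >> stage) & 1u) ? 0xFFFFFFFFu : 0u;
        x = (rotated & sel) | (x & ~sel);                /* mux row        */
    }
    return x;
}
```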
> One of the best parts about RiscV is that you can teach a freshman level architecture class or a senior level chip building project with an ISA that is actually used.

Same could be said of MIPS.

My understanding is the RISC-V raison d'etre is rather avoidance of patented/copyrighted designs.

As you indicate, MIPS was widely used in computer architecture courses and textbooks, including pre-RISC-V editions of Patterson & Hennessy (Computer Organization & Design) and Harris & Harris (Digital Design and Computer Architecture).

In spite of the currently mediocre RISC-V implementations, RISC-V seems to have more of a future and isn't clouded by ISA IP issues, as you note.

The avoidance of patent/copyright is critical for (legally) having students design their own chips. MIPS was pretty good (and widely used) for teaching assembly, but pretty bad for teaching a class where students design chips.
This is largely contradicted by the (pre RISC-V) MIPS editions of Patterson & Hennessy, Harris & Harris, etc., which teach you how to design a MIPS datapath (at the gate level.)

Regarding silicon implementations, consider that 1) you can synthesize it from HDL/RTL designs using modern CAD tools, and 2) MIPS was originally designed to be simple enough for grad students to implement with the primitive CAD tools of the 1980s (basically semi-manual layout).

MIPS patents have long expired too (as have the patents on any other CPU released prior to 2006, incidentally), so that's a moot point.
> This is primarily because core is primarily a teaching ISA.

That doesn't necessarily make it all that great for industrial use, does it?

> One of the best parts about RiscV is that you can teach a freshman level architecture class or a senior level chip building project with an ISA that is actually used.

You can also do that with Intel MCS-51 (aka 8051) or even i960. And again, having an ISA easily implementable "on a knee" by a fresh graduate doesn't say anything about its other technical merits other than being "easily implementable (when done in the most primitive way possible)".

The fact the Hazard3 designer ended up creating an extension to resolve related oddities was kind of astonishing.

Why did it fall to them to do it? Impressive that he did, but it shouldn't have been necessary.

Which extension is that?
An extension he calls Xh3bextm. For extracting multiple bits from bitfields.

https://wren.wtf/hazard3/doc/#extension-xh3bextm-section

There are also four other custom extensions implemented.

This extension wasn't strictly necessary but it makes decode of Arm instructions faster in the bootrom's Arm emulator.
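
For anyone unfamiliar with the operation, a multi-bit bitfield extract looks roughly like this in plain C. This is a generic sketch with a hypothetical helper name, `extract_bits`; it is not taken from the Hazard3 documentation and does not claim to match the exact semantics or width limits of Xh3bextm. Without a dedicated instruction, it costs a shift plus a mask on the base ISA.

```c
#include <stdint.h>

/* Return `width` bits of x starting at bit position `lsb`.
 * Assumes 1 <= width <= 32; lsb is taken modulo 32. */
static inline uint32_t extract_bits(uint32_t x, unsigned lsb, unsigned width)
{
    uint32_t mask = (width >= 32u) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (x >> (lsb & 31u)) & mask;
}
```

For example, with the arbitrary 32-bit word 0xE59F1004, `extract_bits(w, 28, 4)` pulls out the top nibble and `extract_bits(w, 12, 4)` pulls out bits 15..12, which is the kind of field slicing an instruction decoder does constantly.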
Do you typically care about portability to the degree that you want the same machine code to execute on both a Linux box and a microcontroller? Why?