by gwu78 3339 days ago
The knee-jerk answer upon seeing the title was: "Too many".

So many of these instructions I never see anyone use, ever.

No doubt there are some folks using them, but as mere mortals, do the rest of us really need all these features to control our small amount of commodity hardware? As a user, I have modest goals. Is it not true that Torvalds wrote his kernel with a similarly modest goal in mind: control over his own commodity computer?

The situation resembles that of an overcomplex software program where a majority of the features are unused by an even larger majority of its users. In other words, the depth of features benefits only the very few people who use them.

Given the choice between several alternatives with differing levels of features, I tend to opt for software that is less featureful and hence simpler. Call me simple-minded if you wish. The same goes for processors, although when it comes to hardware how much choice do we really have as end users? (Hobbyist boards excluded.)

For a taste of some non-x86 assembler, I enjoyed experimenting with a MIPS simulator still found at spimsimulator.sourceforge.net. I can report that the non-GUI portion at least still compiles relatively cleanly on BSD. This simulator has been mentioned on HN several times.

I have no problem with using a processor with fewer instructions even if I have to sacrifice something by making that choice -- I leave it to the experts to detail those sacrifices and why I would be a fool to make them. NB: I am already a fool so it may not be worth the effort.

How many HN readers have tried RISC-V? A poll for those who have not: will RISC-V inspire you to purchase a new computer?

9 comments

I often come across problems, like intermittent graphical glitches, in Ubuntu Linux applications and think to myself "maybe I'll try fixing that bug". I start digging casually, and find that the real issue is not in the application but somewhere in the graphics stack. Or maybe the kernel. Oh no, it's in a closed-source driver. Then I despair and start thinking thoughts like "wouldn't it be nice to rewrite all this from first principles? That way I could get it right!"

Then I do some more research and find that all this flaky software is built on proprietary, minimally-documented hardware with its own stack of bugs, except those bugs will never get fixed because the IP is top-secret and the only ten people who understand it have already moved on to build the next product.

So RISC-V/lowRISC is enticing because it promises an architecture that is powerful enough to be more than a toy or academic exercise, and also fully open from the ground up, so the public can iterate on it and finally fix bugs - or at least understand them.

(Yes I know, I'm mixing complaints about GPUs into a CPU conversation here...)

I'm also encouraged by the slowdown in Moore's Law - alternative architectures have historically been steamrollered by Intel's phenomenal process engineering capability. If process nodes reach a plateau, the mad miniaturization march of the last 50 years will pause for breath and let a much wider and deeper variety of hardware hackers get involved.

Keep in mind that the numbers presented in the article are large because the majority of them come from a nearly cartesian product; if you consider the ALU-ish operations alone, there's the operation (add, sub, adc, sbb, and, or, cmp, xor, etc.), addressing mode (reg, mem[imm], mem[reg], mem[reg+reg * scale], mem[reg+imm], mem[reg * scale+imm], mem[reg+reg * scale+imm]), width and type (int8, int16, int32, int64, float32, float64, float80, plus vector variants: int8x8, int16x4, int32x2, int64x1, float32x4, ... ).

Going by the list above, there are already roughly 8x8x16 = 1024 "instructions", which is within an order of magnitude of the estimates given in the article. The rest (probably dozens at most) are mainly for OS-oriented system management, and special operations done in hardware like AES (without which the equivalent software implementation would be orders of magnitude slower).
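To make the arithmetic concrete, here is a minimal sketch of that counting in Python. The names `op0`, `mode0`, `type0` and so on are placeholders I made up; only the counts (8, 8 and 16) come from the rough estimate above, not from any real opcode table.

```python
from itertools import product


def enumerate_variants(operations, addressing_modes, operand_types):
    """Return every (operation, addressing mode, operand type) combination."""
    return list(product(operations, addressing_modes, operand_types))


# Placeholder names: only the list lengths (8, 8, 16) mirror the estimate.
operations = [f"op{i}" for i in range(8)]
addressing_modes = [f"mode{i}" for i in range(8)]
operand_types = [f"type{i}" for i in range(16)]

variants = enumerate_variants(operations, addressing_modes, operand_types)
print(len(variants))  # 8 * 8 * 16 = 1024
```

Each tuple in `variants` stands for one "instruction" in the sense the article counts them, which is why a handful of orthogonal choices multiplies out so quickly.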

The main "simplification" which RISCs have made is to restrict the operand types, so that e.g. most if not all the ALU ops must use register addressing modes, and all the other addressing modes (of which there are not many) are restricted to memory-register move instructions. That turns parts of the cartesian product into a sum, reducing the instruction count by the above measures, but IMHO it is the wrong thing to do, since now all software has to contain more instructions to do what the hardware would otherwise be able to figure out (and possibly optimise execution of) in a CISC. For example, x86 can express mem[reg + reg * scale + imm] = reg + mem[reg + reg * scale + imm] in a single instruction (decoded into multiple uops, which can be scheduled into whatever hardware resources are available) while a RISC would require several.
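A rough way to see the "product becomes a sum" point is the toy count below. It is a deliberately crude model with made-up function names: it assumes a load/store design has one load and one store per memory addressing mode and operand type, and that ALU ops only ever take registers. Real ISAs are messier on both sides.

```python
def cisc_style_count(n_ops, n_modes, n_types):
    """Every ALU op may combine with every addressing mode and type."""
    return n_ops * n_modes * n_types


def risc_style_count(n_ops, n_mem_modes, n_types):
    """ALU ops are register-only; addressing modes appear only on moves."""
    alu = n_ops * n_types               # register-to-register ALU ops
    moves = 2 * n_mem_modes * n_types   # one load and one store per mode/type
    return alu + moves


# Same rough figures as the estimate above (assumed, not measured).
print(cisc_style_count(8, 8, 16))  # 1024
print(risc_style_count(8, 8, 16))  # 128 + 256 = 384
```

The count shrinks, but the work does not disappear: the read-modify-write example above turns into a separate load, add and store in the program text.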

I'd be more interested in seeing the maximum number of distinct opcodes Clang and GCC can output for x86_64. I suspect there are many they don't use because they're either obsolete, unnecessary or for very specific operations such as those used by video codecs.
Prediction: RISC-V will at best catch on in some niche markets. If RISC-V is lucky, it will maybe displace MIPS.

--

Usual P.S.: The ISA has become less and less relevant to processor performance since the mid 90s (when Intel and AMD introduced micro-ops to the x86 world by dynamic translation) and is now less relevant than ever before. What is relevant is developer mindshare, good compilers, availability of optimized algorithms, and manufacturers pouring billions into creating better scheduling and execution for µops.

The performance is not the relevant part. If it's fast enough, it's enough. Then other things matter. For me, that "then" is already reached.
I have RISC-V hardware (got a handful of HiFive1 boards) and they're definitely the fastest Arduino compatibles to date. If SiFive made an SoC for Chromebooks (maybe in collaboration with a large manufacturer like Samsung), I would buy them and suggest them to friends. I would love to have a big and wide RISC-V desktop workstation (a couple hundred gigs of RAM and a couple dozen cores) once all the prerequisites are in place, but that stuff takes more time, it seems. The Shakti folks over at IIT Madras seem like they'll be the first ones to tape out RISC-V workstations, servers, and HPC nodes.

Privilege separation models are finicky, so it's probably not a good idea to rush that.

I think that if we can all throw our weight behind RISC-V, the availability, diversity, and good functioning of computer platforms will improve drastically.

My favorite legacy x86 instructions are the BCD opcodes like AAA (ASCII Adjust After Addition). They are apparently now slower than doing the conversion and operations yourself, but are kept in there for compatibility with the original 8086.
They likely decode to the same sequence of uops internally; I benchmarked them and they're basically the same speed as the equivalent sequence of simpler operations, but a lot shorter. See this item for more information:

https://news.ycombinator.com/item?id=8477254

However, they were much slower in the days of the P4 (which was really an oddity in x86 performance characteristics.)
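For anyone who has never met AAA, here is a small Python model of what it does, written from my reading of the documented behaviour (adjust AX by 0x106 when the low nibble of AL exceeds 9 or the auxiliary carry flag is set, then clear the high nibble of AL). The function name `aaa` and the tuple it returns are my own choices for illustration; the caller supplies the auxiliary flag because the preceding ADD would normally have set it.

```python
def aaa(ax, af=False):
    """Model x86 AAA on a 16-bit AX value.

    ax: AX after an 8-bit ADD of two unpacked BCD or ASCII digits into AL.
    af: auxiliary carry flag left by that ADD.
    Returns (new_ax, af, cf).
    """
    if (ax & 0x0F) > 9 or af:
        ax = (ax + 0x106) & 0xFFFF  # add 6 to AL and 1 to AH in one step
        af = cf = True
    else:
        af = cf = False
    ax &= 0xFF0F  # keep AH, clear the high nibble of AL
    return ax, af, cf


# '9' + '8' = 0x39 + 0x38 = 0x71, with a carry out of the low nibble (AF set).
ax, af, cf = aaa(0x0071, af=True)
print(hex(ax))  # 0x107, i.e. AH = 1 and AL = 7, the digits of 17
```

The "equivalent sequence of simpler operations" is essentially the compare, add and mask inside the function, which is why the single opcode and the hand-written version can end up costing about the same.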

> No doubt there are some folks using them, but as mere mortals, do the rest of us really need all these features to control our small amount of commodity hardw

They aren't for mere mortals though, they are for compilers.

Elegance and simplicity are nice, but performance is king. How fast would a RISC-V core need to be to be able to compete with e.g. AES-NI?
>will RISC-V inspire you to purchase a new computer?

Not directly, but whatever in it inspired the lowRISC team will likely have me buying one of their boards. Mostly for the minion cores on them, which meet a need I have.

From my understanding we can't get much higher sequential instructions / second than we already have.

So the instructions have to do more each (CISC), or we have to do a lot of instructions in parallel. Maybe RISC could shine in massively parallel processing units.

Although Intel CPUs have a CISC instruction set, the instructions are converted internally to RISC-like uops in the early instruction decode stage. So a CISC instruction that increments a memory location is converted into uops that load from memory into an internal register, increment that register, and store the result back to memory. These days the uops are so powerful that the decoder sometimes does the opposite, merging adjacent compare and branch instructions into one uop.
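As a toy illustration of both directions, the sketch below "cracks" a memory-destination instruction into load, operate and store steps, and "fuses" an adjacent compare and branch. Everything here is hypothetical: the tuple encoding, the `tmp` register and the `cmp_jne` name are invented for the example and say nothing about how any real decoder represents uops.

```python
def crack(instruction):
    """Split a toy (op, dst, src) instruction into RISC-like micro-ops."""
    op, dst, src = instruction
    if dst.startswith("["):  # memory destination: read-modify-write
        return [
            ("load", "tmp", dst),   # memory -> internal register
            (op, "tmp", src),       # operate on the internal register
            ("store", dst, "tmp"),  # internal register -> memory
        ]
    return [instruction]            # register-only ops pass through


def fuse_compare_branch(instructions):
    """Merge an adjacent ("cmp", a, b) and ("jne", target) into one op."""
    fused = []
    i = 0
    while i < len(instructions):
        current = instructions[i]
        following = instructions[i + 1] if i + 1 < len(instructions) else None
        if current[0] == "cmp" and following is not None and following[0] == "jne":
            fused.append(("cmp_jne", current[1], current[2], following[1]))
            i += 2
        else:
            fused.append(current)
            i += 1
    return fused


print(crack(("add", "[counter]", "1")))
print(fuse_compare_branch([("cmp", "eax", "ebx"), ("jne", "loop")]))
```

The first call yields three steps for one instruction; the second yields one step for two instructions.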
x86 has been RISC ever since the Pentium Pro. The programmer-visible CISC instruction set is just a compacted form analogous to ARM Thumb.