by userbinator 4331 days ago
The widening gap between memory and core speeds suggests to me that traditional RISC philosophy is not the way forward for performance and efficiency. Fixed-length instructions, load-store restrictions, and delay slots may have made implementation easier and faster at a time when memory could keep up with the CPU and instruction decoding was the bottleneck. Now that memory is often the bottleneck, it makes sense to have a more complex, denser instruction encoding, along with the other features that are usually left out of RISCs but improve code density.

Variable-length instructions are especially beneficial to code density, since often-used instructions can be encoded in fewer bytes, leaving rarer ones to longer sequences. They also allow for easy extension. Relaxing the restriction that only load/store instructions can access memory can reduce code size by eliminating many instructions whose sole purpose is to move data between memory and registers. This also means fewer explicitly named registers are required (since instructions that read memory implicitly use internal temporary registers inside the CPU), which reduces the number of bits needed to specify registers.
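
To put rough numbers on the density argument, here is a minimal sketch comparing a fixed 4-byte encoding with a variable-length one. The instruction categories, their shares, and their byte sizes are all made up for illustration; they are not measurements of x86, Thumb, or any other real ISA.

```python
# Hypothetical instruction mix: category -> (share in percent, size in bytes
# under a variable-length encoding). These figures are assumptions chosen
# for illustration only, not data from a real ISA or benchmark.
MIX = {
    "reg-reg ALU": (40, 2),
    "load/store": (30, 3),
    "branch": (20, 2),
    "long immediate": (10, 6),
}

FIXED_BYTES = 4  # every instruction costs this much in the fixed encoding


def average_bytes(mix):
    """Weighted average instruction size in bytes for the given mix."""
    total_share = sum(share for share, _ in mix.values())
    weighted = sum(share * size for share, size in mix.values())
    return weighted / total_share


variable_avg = average_bytes(MIX)
ratio = variable_avg / FIXED_BYTES

print(f"fixed:    {FIXED_BYTES} bytes per instruction")
print(f"variable: {variable_avg} bytes per instruction on average")
print(f"relative code size: {ratio:.3f}")
```

With this particular mix the rare 6-byte form costs more than the fixed encoding, but the average still drops because the common forms are short. A different mix could easily tip the other way, which is why the real frequencies matter.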

Other considerations like number of operands and how many of them can be memory references also contribute to code density - 0- and 1-operand ISAs require far more instructions for data movement, while 3-operand ISAs may waste encoding space if much of the time, one source operand does not need to be preserved. 2 operands is a natural compromise, and this is what e.g. ARM Thumb does.

This is why I find the description of "compressed RISC-V" linked in the article ( http://www.eecs.berkeley.edu/~waterman/papers/ms-thesis.pdf ) interesting - benchmark analysis shows that 8 registers are used 60% of the time, and 2-operand instructions are encountered 36/31% statically/dynamically. These characteristics are not so far from those of an ISA that has remained one of the most performant for over 2 decades: x86. It's a denser ISA than regular RISCs, and requires more complex decoding, but not as complex as e.g. VAX. I think the decision to have 8 architectural registers and a 2-operand/1-memory format put x86 in an interesting middle-ground where it wasn't too CISC to implement efficiently, but also wasn't RISC enough to suffer its drawbacks. I'd certainly like to see how an open-source x86 implementation could perform in comparison.

3 comments

You are not quite right, to put it mildly.

The real problem is energy efficiency. It rears its head everywhere, from embedded systems and tablets to supercomputers. Even desktops are affected: you can cut the cost of a machine by using a less costly power supply.

Most of the time, RISC CPUs are not constrained by how much fits in the instruction cache, because most of the time is spent in some tight loop. And you can see how hard it is to create an energy-efficient implementation of x86, in noticeable part because the x86 decoder is complex.

> Variable-length instructions are especially beneficial to code density, since often-used instructions can be encoded in fewer bytes,

Speaking of this, I find it interesting that ARM went back to a fixed 32-bit instruction width for ARMv8 (from 16/32 in Thumb-2). Any idea why they chose to do this?

ARMv8 is targeted at very high-end phones, but mainly at servers (of course it will creep down into cheap feature phones eventually). My server has 16 GB of RAM, which is small for an ARMv8 server, so memory pressure may not be such a problem.

However it's also worth saying that Cortex-A53 can run Thumb-2 instructions. Not sure about Cortex-A57.

According to http://en.wikipedia.org/wiki/ARM_Cortex-A57 the Cortex-A57 supports Thumb-2 instructions.
I suppose now is the time for me to patent my idea of Huffman-coded instruction sets. Avoid wasting a single bit!
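
For what the Huffman idea would look like on paper, here is a minimal sketch that builds Huffman code lengths for a handful of opcodes. The opcode names and frequencies are hypothetical, and the sketch only covers opcode bits; it ignores operands, alignment, and decoder cost, which is where a real design would run into trouble.

```python
import heapq

# Hypothetical opcode frequencies (percent of instructions). These are
# assumptions for illustration, not measurements from any real program.
OPCODE_FREQS = {"mov": 40, "add": 20, "load": 15, "store": 10, "branch": 10, "mul": 5}


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code over freqs."""
    # Heap entries are (weight, tiebreak, {symbol: depth so far}).
    # The integer tiebreak keeps tuple comparison away from the dicts.
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {sym: depth + 1 for sym, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


lengths = huffman_code_lengths(OPCODE_FREQS)
total_weight = sum(OPCODE_FREQS.values())
weighted_bits = sum(OPCODE_FREQS[s] * lengths[s] for s in OPCODE_FREQS)
average_bits = weighted_bits / total_weight

# A fixed-width opcode field for these 6 opcodes needs 3 bits.
fixed_bits = (len(OPCODE_FREQS) - 1).bit_length()

for sym in sorted(lengths, key=lengths.get):
    print(f"{sym:7s} {lengths[sym]} bits")
print(f"average opcode length: {average_bits} bits (fixed field: {fixed_bits} bits)")
```

The most common opcode ends up with a single bit and the rarest ones get the longest codes, which is the same trade variable-length ISAs make by hand, just taken to bit granularity.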