by KMag 1206 days ago
A while back, I read a Sun patent on not implementing some instructions and emulating them in the kernel's illegal operation trap handler. The whole patent seemed obvious to me, but I'm glad that it was patented and now expired, providing obvious prior art in any attempts to patent it today.

For MIT's 6.004 "Beta" processor, loosely based on the DEC Alpha AXP, our test cases ran with a minimal kernel that would trap and emulate multiply, divide, etc. instructions using shifts and adds/subtracts, so we could implement simpler ALUs and still test the full instruction set.
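
For anyone who hasn't seen the trick, here is a rough Python sketch of what such a handler might compute once it has decoded the trapped instruction. This is not the 6.004 kernel code; the function names and the 32-bit word size are assumptions made up for illustration.

```python
WORD_BITS = 32                 # assumed word size for this sketch
MASK = (1 << WORD_BITS) - 1


def emulate_mul(a, b):
    """Multiply two unsigned words using only shifts and adds."""
    a &= MASK
    b &= MASK
    product = 0
    while b:
        if b & 1:                       # low bit of multiplier set: add shifted multiplicand
            product = (product + a) & MASK
        a = (a << 1) & MASK             # next partial product
        b >>= 1
    return product


def emulate_divu(dividend, divisor):
    """Unsigned restoring division using only shifts and subtracts."""
    dividend &= MASK
    divisor &= MASK
    if divisor == 0:
        # A real handler would apply whatever the ISA specifies here.
        raise ZeroDivisionError("divide by zero in emulated instruction")
    quotient = 0
    remainder = 0
    for i in range(WORD_BITS - 1, -1, -1):
        # Bring down the next dividend bit, most significant first.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1 << i
    return quotient, remainder
```

The trap handler would write the result back to the destination register and resume after the faulting instruction; that plumbing is omitted here.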

In any case, particularly in the world of hypervisors, it doesn't seem too hard to deprecate an instruction and stop implementing it in hardware, and push that complexity into firmware. As long as the CPU covers the Popek and Goldberg virtualization requirements, hypervisors could be nested, and the firmware could implement a lowest level hypervisor that handles unimplemented instructions.

More generally, I wish ARM64, RISC-V, and other modern ISAs had taken DEC Alpha AXP's idea of restricting all of the privileged instructions to the firmware (PALCode in the Alpha's case) and basically implementing a single-tenant hypervisor in the firmware. The OS kernel always used an upcall instruction to the hypervisor/firmware to perform privileged operations. In other words, the OS kernel for Alpha was always paravirtualized. (UNSW's L4/Alpha microkernel was actually implemented in PALCode, so in that case, the L4 microkernel was the firmware-implemented hypervisor and the L4 syscalls were upcalls to the firmware.) As it stands, hypervisors need to both implement upcalls for efficiency and also implement trap-and-emulate functionality for OS kernels that aren't hypervisor-aware. The trap-and-emulate portions of the code are both lower performance and more complicated than the upcall handlers. Both hypervisors and OS kernels would be simpler if the platform guaranteed a hypervisor is always present.

Always having a firmware hypervisor also allows pushing even more complexity out of hardware into the firmware. The Alpha had a single privilege bit indicating whether it was currently running in firmware/hypervisor (PALCode) mode, and the firmware could emulate an arbitrary number of privilege levels/rings. The Ultrix/Tru64 Unix/Linux firmware just emulated kernel and user modes, but the OpenVMS firmware emulated more levels/rings. x86's 5 rings (including "ring -1"/hypervisor) could be efficiently emulated by hardware that only implements ring -1 (hypervisor) and ring 3 (user mode).
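
A toy Python model of the idea, in case it helps: the hardware tracks one bit, and the ring number is just a variable the firmware keeps. The class and method names are hypothetical, and this is not how PALCode is actually structured; it only shows that the ring check can live entirely in software.

```python
class RingEmulator:
    """Firmware-side model: hardware only knows 'in firmware' or 'not'."""

    def __init__(self, num_rings):
        self.num_rings = num_rings
        self.virtual_ring = 0      # software-tracked ring; 0 is most privileged
        self.in_firmware = False   # stands in for the single hardware privilege bit

    def upcall(self, required_ring, operation):
        """Guest asks firmware to perform a privileged operation."""
        self.in_firmware = True    # hardware sets the bit on entry
        try:
            if self.virtual_ring > required_ring:
                return ("fault", None)      # caller's emulated ring is too low
            return ("ok", operation())
        finally:
            self.in_firmware = False        # bit cleared on return to the guest

    def switch_ring(self, new_ring):
        """Firmware changes the emulated ring, e.g. on a syscall or return."""
        if not 0 <= new_ring < self.num_rings:
            raise ValueError("no such ring")
        self.virtual_ring = new_ring
```

Adding a ring is then a firmware change (a bigger `num_rings` and a policy for transitions) rather than a hardware change.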

Edit: Taken to an extreme, you get something like Transmeta's Crusoe, which pushed instruction decoding and scheduling into a firmware hypervisor JIT that works at the processor's microcode level. In retrospect, it seems that Crusoe went too far, at least as far as early-2000s technology could go. However, there's still plenty of optimization space in between the latest Intel processors on the extreme hardware complexity side and Transmeta's Crusoe on the extreme firmware complexity side.

Edit 2: In-order processors like (at least early) Intel Atom, P.A. Semi's PWRficient, and Transmeta's Crusoe tend to be more power-efficient. If the architecture were designed for it, I could see a case for limited out-of-order hardware capability with hardware tracing and performance counters/reservoir sampling of instructions that caused pipeline stalls. The firmware could then use run-time information to JIT re-order the instruction streams in hotspots that weren't well-served by the hardware's limited out-of-order execution capacity. This might be a viable alternative to ARM's big.LITTLE, where the firmware (or kernel) kicks in to provide a performance boost to hotspots when plugged in, and executes as a simple in-order processor when lower power consumption is desired, without the extra complexity of separate pairs of cores for performance and efficiency. Hardware sampling of which speculations work out and which are wasted would presumably guide the firmware's attempts to efficiently re-optimize the hotspots.
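
For reference, reservoir sampling keeps a fixed-size uniform sample of an unbounded event stream, which is why it suits a small hardware or firmware buffer. Here is a sketch of the standard "Algorithm R" in Python; treating each event as the program counter of a stalling instruction is my assumption, not something any real hardware is claimed to do.

```python
import random


def reservoir_sample(stall_events, k, rng=None):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    rng = rng or random.Random()
    reservoir = []
    for n, event in enumerate(stall_events):
        if n < k:
            reservoir.append(event)      # fill the reservoir first
        else:
            j = rng.randint(0, n)        # inclusive on both ends
            if j < k:
                reservoir[j] = event     # replace with probability k / (n + 1)
    return reservoir
```

Firmware could then count which addresses dominate the sample and pick those regions for re-optimization.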

3 comments

> I wish ARM64, RISC-V, and other modern ISAs had taken DEC Alpha AXP's idea of restricting all of the privileged instructions to the firmware

This is already possible on RISC-V to some extent, by trapping privileged instructions into a higher privilege mode. The ISA is designed so that this can be done cleanly. It also provides no way to detect the current privilege mode, so a kernel running in U-mode and trapping on each privileged instruction would never know it's not actually in S-mode.

There is even a software-based hypervisor extension emulator built on that idea, which brings KVM to non-hypervisor-capable HW: https://github.com/dramforever/opensbi-h

Right, but it's cleaner and performs better to use higher-level upcalls to the hypervisor rather than trapping and emulating every privileged instruction.

As it stands, the hypervisor needs to implement both trap-and-emulate and upcall handlers, and OSes need to implement both running on bare metal and (if they want to perform well on hypervisors) hypervisor upcalls.
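
To illustrate the two paths, here is a toy Python sketch. The instruction encoding, the opcode value, and the names are all invented for the example and don't correspond to any real ISA or hypervisor.

```python
OP_WRITE_PTBR = 0x2A   # made-up opcode for "write page table base register"


class ToyHypervisor:
    def __init__(self):
        self.page_table_base = 0

    # Path 1: paravirtual upcall. The guest states its intent directly.
    def upcall_set_page_table(self, base):
        self.page_table_base = base

    # Path 2: trap-and-emulate. The guest executed a privileged instruction,
    # so the hypervisor must decode it and fetch operands from guest registers.
    def trap(self, instruction_word, guest_regs):
        opcode = instruction_word >> 8    # made-up encoding: high bits are the opcode
        reg = instruction_word & 0xFF     # low 8 bits name the source register
        if opcode == OP_WRITE_PTBR:
            self.upcall_set_page_table(guest_regs[reg])
        else:
            raise NotImplementedError("unhandled privileged instruction")
```

Both paths end up in the same place, but the trap path carries the decode step and needs a case for every privileged instruction a guest might execute.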

If you want your hypervisor to support nested hypervisors, then I guess you'd still need to implement trap-and-emulate in the hypervisor to allow running a hypervisor on top. However, you at least remove the dual paths in the OS kernel if you just disallow the bare-metal case. This also allows a bit more flexibility in hardware implementation as you can change the hardware implementation and the instruction sequence in the hypervisor without needing to modify any legacy OS kernels.

>A while back, I read a Sun patent on not implementing some instructions and emulating them in the kernel's illegal operation trap handler. The whole patent seemed obvious to me, but I'm glad that it was patented and now expired, providing obvious prior art in any attempts to patent it today.

I understand opensbi runs in M mode, taking on that role among others.

> Both hypervisors and OS kernels would be simpler if the platform guaranteed a hypervisor is always present.

This would make an excellent RISC-V proposal!

The nice thing is that RISC-V already has the concept of the hart; you could have a supervisor hart that manages many virtual harts.