by detrites 1106 days ago
There are several different types of CPUs, in two main classes, CISC and RISC. The difference is summarised by the first letter - "Complex" vs. "Reduced" - Instruction Set Computer. Or, what size "vocabulary" a CPU decodes.

RISC-V is a type of CPU architecture (a set of plans for how to build one, not an actual CPU itself), that also happens to be open source. Anyone can build a RISC-V CPU without having to buy the rights to do so. (Many are.)

This project is an emulation of a RISC-V CPU. A kind of virtual "reference" CPU in software. It can be used to compile code that can run on a RISC-V type CPU, and to help understand what's happening inside the CPU when it runs.

It's written in C, which is and was a very fundamental programming language that's influenced the design of many other languages. It is a language that is very close to the fundamental language CPUs natively decode and process.

CPUs natively use a language referred to as "Assembly", but which actually has many varieties particular to each CPU design. Regardless of the variety of CPU, assembly is usually about as "close to the metal" as it reasonably gets.

It's literally communicating with the CPU directly in its own language. This makes it extremely fast to run, but laborious to code, and also somewhat "dangerous" in that with such low-level control, it's easy to mess things up.

This project takes as input a text list of RISC-V assembly instructions (a "program") and pretends to be a RISC-V CPU with those instructions loaded into it and being run on it. Useful for understanding, prototyping and building a RISC-V program.

CPUs are designed rather to run assembly that already "works", having been created programmatically (compiled or interpreted) by a higher level language that isn't going to give it things that make no sense (hopefully).

So there is not usually a lot of provision made in the design of the CPU to make it easy to watch it and its state carefully at a low level and examine how your assembly program is working, or not working. Emulation eases this.
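To make the "watch the state" point concrete, here is a minimal sketch in C of what such an emulator does at its core. It is not the project's actual code: the `Cpu` struct and the `step` and `trace_program` names are made up for illustration, and it only understands two RV32I instructions (`addi` and `add`). It fetches a 32-bit instruction word, decodes its fields, executes it, and prints a few registers after every step.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical toy CPU state: 32 integer registers and a program counter. */
typedef struct {
    uint32_t x[32];
    uint32_t pc;
} Cpu;

/* Fetch, decode and execute one instruction.
 * Returns 0 on success, -1 if the instruction is not supported by this toy. */
int step(Cpu *cpu, const uint32_t *mem)
{
    uint32_t insn   = mem[cpu->pc / 4];      /* fetch */
    uint32_t opcode = insn & 0x7f;           /* decode the fixed fields */
    uint32_t rd     = (insn >> 7) & 0x1f;
    uint32_t funct3 = (insn >> 12) & 0x7;
    uint32_t rs1    = (insn >> 15) & 0x1f;
    uint32_t rs2    = (insn >> 20) & 0x1f;
    uint32_t funct7 = insn >> 25;
    uint32_t imm    = insn >> 20;            /* 12-bit I-type immediate */
    uint32_t result;

    if (imm & 0x800u)
        imm |= 0xfffff000u;                  /* sign-extend to 32 bits */

    if (opcode == 0x13 && funct3 == 0) {
        result = cpu->x[rs1] + imm;          /* addi rd, rs1, imm */
    } else if (opcode == 0x33 && funct3 == 0 && funct7 == 0) {
        result = cpu->x[rs1] + cpu->x[rs2];  /* add rd, rs1, rs2 */
    } else {
        return -1;
    }

    if (rd != 0)
        cpu->x[rd] = result;                 /* x0 is hard-wired to zero */
    cpu->pc += 4;
    return 0;
}

/* Run `count` instruction words from `mem`, printing state after each one.
 * Returns the number of instructions executed. */
int trace_program(Cpu *cpu, const uint32_t *mem, size_t count)
{
    int executed = 0;

    while (cpu->pc / 4 < count) {
        uint32_t pc = cpu->pc;
        if (step(cpu, mem) != 0) {
            printf("pc=%u: unsupported instruction 0x%08x\n",
                   (unsigned)pc, (unsigned)mem[pc / 4]);
            break;
        }
        executed++;
        printf("pc=%2u  x1=%u x2=%u x3=%u\n", (unsigned)pc,
               (unsigned)cpu->x[1], (unsigned)cpu->x[2], (unsigned)cpu->x[3]);
    }
    return executed;
}
```

Calling `trace_program` from a `main` with a zeroed `Cpu` and the three words `0x00500093`, `0x00700113`, `0x002081b3` (that is `addi x1, x0, 5`, `addi x2, x0, 7`, `add x3, x1, x2`) prints one line per instruction and leaves 12 in `x3`. A real emulator handles far more instructions plus memory, but the loop has the same shape.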

3 comments

> CPU’s natively use a language referred to as “Assembly”, b

Strictly, CPUs use machine code. Assembly targeting a particular CPU is a very thin more-human-readable abstraction around the underlying machine code, but it is not, itself, what the CPU executes. That’s why “assemblers” exist – they are compilers from assembly language to machine code (though, because assembly is a very thin abstraction, they are much simpler than most other compilers.)
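As a rough illustration of how thin that abstraction is, here is a sketch in C of the one step an assembler performs for a single RISC-V instruction. The function names `encode_i_type` and `encode_addi` are hypothetical; the bit layout is the standard RV32I I-type format (immediate, rs1, funct3, rd, opcode).

```c
#include <stdint.h>

/* Pack the fields of an RV32I I-type instruction into its 32-bit machine word.
 * Hypothetical helper for illustration, not taken from any real assembler. */
uint32_t encode_i_type(uint32_t opcode, uint32_t funct3,
                       uint32_t rd, uint32_t rs1, int32_t imm)
{
    return (((uint32_t)imm & 0xfffu) << 20)  /* imm[11:0] in bits 31..20 */
         | ((rs1 & 0x1fu) << 15)             /* source register          */
         | ((funct3 & 0x7u) << 12)           /* operation variant        */
         | ((rd & 0x1fu) << 7)               /* destination register     */
         | (opcode & 0x7fu);                 /* major opcode             */
}

/* "addi rd, rs1, imm" uses opcode 0x13 with funct3 = 0. */
uint32_t encode_addi(uint32_t rd, uint32_t rs1, int32_t imm)
{
    return encode_i_type(0x13, 0x0, rd, rs1, imm);
}
```

With this, the text `addi x1, x0, 5` becomes `encode_addi(1, 0, 5)`, which yields the word `0x00500093`. That number is what the CPU actually fetches and executes; a real assembler adds parsing, labels and the other instruction formats on top.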

Agree. And deeper than that may be microcode, which we rarely see or reason about and which, while it may very much be there, is rarely of practical use. (I.e., when learning, the distinctions may be somewhat of an impediment without payoff.)
Would calling "Assembly" a CPU's frontend language be correct?

The same way as it is in compilers

This is a well-written explanation!

I wrote a short news item about the emulator on the French collaborative website linuxfr.org (see https://linuxfr.org/news/un-emulateur-et-un-desassembleur-ri...)

I would like to translate your comment and add it. May I?

Wow this is beyond helpful. Thank you so much for taking the time to explain this so thoroughly.