by strooper 2242 days ago
Maybe its days are numbered, but MIPS still has a significant presence in low- to mid-range routers (Mediatek's MT72xx, Atheros AR7xxx, and so on).

MIPS still has a very large presence in the education sector as well. MIPS asm is very easy to learn because of how consistent its syntax is and because of the plethora of open-source debuggers.

Learning MIPS was what originally got me interested in ASM programming since we had a class that was focused on MIPS code and another class that had us build a digital MIPS processor from scratch. The combination of these two classes really sold me on the magic of super low-level programming.

For those who want to take a look, "Computer Organization and Design" by Patterson and Hennessy is a very common textbook on the subject (at least here in Italy). I found reading it to be a very, very instructive experience.

Its version of 32-bit MIPS is so simple that its whole instruction set fits on a two-sided cheat sheet (the famous "green sheet"). The design of the CPU is quite simple too. Given an instruction and its binary representation, it is fairly straightforward to see how each bit contributes to the computation (setting the correct ALU operation, retrieving a value from the correct register, etc.).
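
To make the "each bit contributes" point concrete, here is a minimal sketch that splits a 32-bit R-type instruction word into its fields. The function name `decode_r_type` is just an illustrative choice; the field layout (6/5/5/5/5/6 bits) is the standard MIPS32 R-type format.

```python
def decode_r_type(word):
    """Split a 32-bit MIPS R-type instruction word into its six fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # 6 bits: 0 for R-type instructions
        "rs": (word >> 21) & 0x1F,      # 5 bits: first source register
        "rt": (word >> 16) & 0x1F,      # 5 bits: second source register
        "rd": (word >> 11) & 0x1F,      # 5 bits: destination register
        "shamt": (word >> 6) & 0x1F,    # 5 bits: shift amount
        "funct": word & 0x3F,           # 6 bits: selects the ALU operation
    }


# 0x02324020 encodes "add $t0, $s1, $s2" ($s1 = 17, $s2 = 18, $t0 = 8).
fields = decode_r_type(0x02324020)
print(fields)
# {'opcode': 0, 'rs': 17, 'rt': 18, 'rd': 8, 'shamt': 0, 'funct': 32}
```

The `funct` value 32 (0x20) is what tells the ALU to add, and `rs`/`rt`/`rd` go more or less straight to the register file's read and write ports.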

Also, Microchip’s PIC32 microcontrollers, which are still around, have a 32-bit MIPS ISA.
Yep, I also learned MIPS in my CS class. It's pretty nice to program in.
MIPS is typically seen in computer architecture classes because it is simple and regular (at least the original version seen in class). However, for assembly programming, MIPS, like (almost?) all RISC instruction sets, is tedious. Give me any CISC with a generous range of addressing modes, and I'll take it any day over MIPS/RISC.

We can draw a parallel between those low-level ISAs and high-level languages: a language like Lisp is lean and simple, so it is taught and presented as good design (and people who went through that education remember it that way), but when it comes to producing real programs almost everybody chooses a much less regular language, which is far more practical. (The same could be said for stack-based languages like Forth, which present an extremely simple model to grasp, but that doesn't mean at all that they are simple to program in.)

Or postfix vs infix for mathematical expressions/calculations. Same principle: the one which is based on a very simple model is praised by aesthetes, but almost everybody prefers the other one, which is simpler to use because it is more natural, despite being based on a more complex model.
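
As a rough illustration of how simple the postfix model is for the implementer, here is a minimal sketch of a postfix evaluator. The name `eval_postfix` and the whitespace-separated token format are assumptions made for this example.

```python
import operator

# Binary operators supported by this sketch.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_postfix(expr):
    """Evaluate a whitespace-separated postfix expression with one stack."""
    stack = []
    for token in expr.split():
        if token in OPS:
            right = stack.pop()  # operands come off in reverse order
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]


print(eval_postfix("3 4 + 2 *"))  # infix equivalent: (3 + 4) * 2
# 14.0
```

There is no precedence table and no parenthesis handling, which is the whole appeal for the implementer. An infix evaluator needs both, yet most people would still rather type `(3 + 4) * 2`.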

In fact, the simplicity of the model is not of much interest to the user; it just makes the implementer's life easier. But for every implementer, there are thousands or millions of users, who want ease of use, not ease of implementation.

Traditional addressing modes have been largely abandoned because modern architectures are based on the load-store principle. Simplicity has little to do with it, and referring to that whole shift in design as "complex" vs. "simple" instruction sets is a bit of a misnomer. Besides, well-designed architectures are not exactly lacking in ease-of-use.
This is only "sort of" true.

First, mod-r/m addressing on x86 is fairly traditional and can often save considerable calculation over a "simpler" addressing mode (given the opportunities for add-and-scale operations).
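
For anyone who hasn't looked at it, the add-and-scale form boils down to the arithmetic below. This is only a sketch of the address calculation, not of the instruction encoding; the function name `effective_address` and the 64-bit wraparound are assumptions for the example.

```python
def effective_address(base, index=0, scale=1, disp=0):
    """Compute base + index * scale + disp, as a scaled-index memory operand does."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 only allows scale factors of 1, 2, 4 or 8")
    # Wrap to 64 bits, assuming 64-bit addressing for this sketch.
    return (base + index * scale + disp) & 0xFFFFFFFFFFFFFFFF


# Element 3 of an array of 8-byte items that starts 16 bytes into a struct at 0x1000.
print(hex(effective_address(base=0x1000, index=3, scale=8, disp=16)))
# 0x1028
```

On a machine with only base-plus-offset addressing, the multiply (or shift) and the add each become separate instructions before the load.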

Second, treating x86 machines as load/store architectures passes up the opportunity to achieve improved code density and increased execution bandwidth from "microfusion" - this is when an operation (e.g. "add") is done with a memory operand. Microfusion, for those not familiar with it, allows two "micro-ops" (aka uops) that originate from the same instruction to be "fused" - that is, issued and retired together (even though they are executed separately).

This can occasionally - in code that has already been tuned to within an inch of its life - yield speedups, as Skylake and similar can only issue and retire 4 uops per cycle. However, there are 8 execution ports (of which only 4 do traditional 'computation'). Carefully designed code can take advantage of the fact that issue/retire are in the "fused domain" while execute is in the "unfused domain" - so you can sometimes get 4 computations and 1 load per cycle even on a 4-issue machine.
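
The bookkeeping behind that claim can be sketched in a few lines. This is a toy model, not a description of any real core: it assumes every instruction with a memory source operand fuses perfectly into one fused-domain uop, which real hardware does not always manage, and the names `ISSUE_WIDTH` and `uop_counts` are made up for the example.

```python
ISSUE_WIDTH = 4  # fused-domain uops issued per cycle, the figure quoted above


def uop_counts(instructions):
    """Return (fused, unfused) uop counts.

    instructions is a list of (mnemonic, has_memory_source) pairs.
    Toy assumption: a load+op instruction is 1 fused uop but 2 unfused uops.
    """
    fused = len(instructions)
    unfused = sum(2 if has_mem else 1 for _, has_mem in instructions)
    return fused, unfused


# Four ALU instructions, one of which takes its second operand from memory.
loop_body = [("add", True), ("add", False), ("sub", False), ("xor", False)]
fused, unfused = uop_counts(loop_body)
print(fused, unfused)  # 4 5
print(fused <= ISSUE_WIDTH)  # True
```

Written load-store style, the same work would be five instructions and five fused-domain uops, so it would not fit in a single 4-wide issue cycle under this model.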

I was trained on MIPS and Alpha, so of course old habits die hard, and it's always tempting to go old school and design everything to act as if the underlying machine is a load-store architecture. However, this (a) isn't necessary on x86 and (b) often won't be faster.

The other blow against load-store is that a modern out-of-order architecture can hoist the load and separate it from the use anyway - and it doesn't have to consume a named register to do it (it will use a physical register, of course, but x86 has way more physical registers than it has names for registers). This of course is a bigger deal given the rather impoverished register count of x86, so it is, in the words of a former Intel colleague on a different topic, a "cure for a self-inflicted injury".

Well, I don't think most people these days are interacting with assembly by hand-writing programs. The real users of ISAs now are compiler authors, and simpler, more regular languages seem better for them. So is there some other reason (other than inertia) that RISC isn't practical?
I think this is what VLIW exposed. Real world optima are a union of hardware, microarch, compiler, and software optima.

Even a global optimum for one is unlikely to be an efficient solution for all.

> Give me any CISC with a generous range of addressing modes, and I take any day over MIPS/RISC.

Or how about a macro assembler where you can do that with custom pseudo-instructions?
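
A minimal sketch of what such a pseudo-instruction could expand to. The pseudo-instruction itself (an indexed load, call it `lwx`) and the function name `expand_indexed_load` are hypothetical; the emitted `sll`/`addu`/`lw` are ordinary MIPS instructions, and `$at` is the register conventionally reserved for the assembler.

```python
def expand_indexed_load(dest, base, index, shift, offset=0):
    """Expand a hypothetical 'lwx dest, offset(base + (index << shift))'.

    Emits three real MIPS instructions, using $at as the scratch register.
    """
    return [
        f"sll $at, {index}, {shift}",  # scale the index
        f"addu $at, $at, {base}",      # add the base address
        f"lw {dest}, {offset}($at)",   # load with the remaining displacement
    ]


for line in expand_indexed_load("$t0", "$s0", "$s1", shift=2, offset=8):
    print(line)
# sll $at, $s1, 2
# addu $at, $at, $s0
# lw $t0, 8($at)
```

You get CISC-like convenience at the source level, but it still costs three instructions at run time.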

It’s nice until you get to its annoying and fairly irrelevant pipelining model.
The ubiquitous delay slots in MIPS are one instruction-set feature that has aged really badly. RISC-V deliberately left them out of its design because they end up being such a hindrance to, e.g., out-of-order implementations.
It's also a hindrance to in-order implementations that have a different number of branch delay cycles (e.g. different number of pipeline stages or instructions taking a variable number of cycles) than the original implementation.

Branch delay slots were a somewhat clever solution to reduce the complexity of the original implementation, but they baked implementation details into the ISA and became problematic when the implementation details changed.
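
For anyone who hasn't run into them, here is a toy interpreter that shows what a delay slot does to program semantics. The three-instruction "ISA", the register names, and the `run` function are all invented for this sketch; only the behavior (the instruction after a jump executes before the jump takes effect) mirrors classic MIPS.

```python
def run(program, delay_slot):
    """Run a toy program and return the final register values.

    Instructions are tuples:
      ("addi", reg, imm)  add imm to a register
      ("j", target)       jump to an instruction index
      ("halt",)           stop
    """
    regs = {"a": 0, "b": 0}
    pc = 0
    pending = None  # jump target that takes effect after the delay slot
    while True:
        op = program[pc]
        next_pc = pc + 1
        if pending is not None:
            # This instruction sits in a delay slot: run it, then jump.
            next_pc, pending = pending, None
        if op[0] == "halt":
            return regs
        if op[0] == "addi":
            regs[op[1]] += op[2]
        elif op[0] == "j":
            if delay_slot:
                pending = op[1]
            else:
                next_pc = op[1]
        pc = next_pc


program = [
    ("j", 3),          # 0: jump to halt
    ("addi", "a", 1),  # 1: delay slot
    ("addi", "b", 1),  # 2: skipped either way
    ("halt",),         # 3
]
print(run(program, delay_slot=True))   # {'a': 1, 'b': 0}
print(run(program, delay_slot=False))  # {'a': 0, 'b': 0}
```

The same binary means two different things depending on a pipeline detail, so every later implementation has to keep emulating the one-slot behavior whatever its pipeline actually looks like.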

> they baked implementation details into the ISA and became problematic when the implementation details changed.

Same reason why stuff like VLIW has failed to catch on. These things are so dependent on specific hardware implementation details that one can hardly call them general-purpose ISAs anymore.

Can confirm, I was taught MIPS assembly in a computer architecture course as recently as 2 years ago.
I have a crappy NAS that runs on a MIPS CPU. Basically the only thing I can run on it is Debian because every other distro seems to have dropped support.
And I was surprised to find out Wyze brand cameras are MIPS, too! Lots of embedded systems are likely still using MIPS.
Those are really starting to be eaten by ARM over the past few years.