This is really sad. The world is heading toward an x86/ARM duopoly. Alpha is dead, MIPS is almost dead, PA-RISC is dead, POWER is too expensive, and RISC-V is mostly a nice-to-have.
A lot of these architectures have some drawbacks in modern times.
Alpha’s loosey-goosey memory model makes multithreaded code on SMP systems more challenging. Linux utilizes its Alpha port as a worst-case testbed for data race conditions in its kernel.
SPARC’s register windows are anachronistic and complicate the implementation of CPUs, and I’d guess also make it more difficult to build OoOE cores (so many SPARC chips are in-order, why?)
POWER isn’t so bad though. It’s open enough where you could build your own lower-cost core if you wanted. There’s nothing intrinsic to the ISA that would mandate an expensive chip other than volume constraints.
PA-RISC put up some great numbers back in the day but between the Compaq acquisition (bringing with it Alpha) and Itanium it was chronically under-resourced. They had a great core in the early 90s and basically just incrementally tweaked it until its death.
I really liked PA-RISC. I thought it was a clean ISA with good performance at the time and avoided many of the pitfalls of other implementations. I think HP didn't want to pour lots of money into it to keep it competitive, though, and was happy to bail out for Itanium when it was viable. My big C8000 is a power hungry titan, makes the Quad G5 seem thrifty.
IDK, I never really liked PA-RISC, but to be fair I was always able to look at it from a hindsight perspective. Looking back it seems to have most of the RISC issues that complicate modern ISA design. Like branch delay slots, having a multiply instruction wasn't RISCy enough for it to bother with, etc.
...and MIPS has the weird branch delay slots as well as pretty horrible code density.
If you look at ARM, particularly the 64-bit version, you'll notice it attempts to squeeze multiple operations into a single 32-bit "instruction". It's still called RISC, but not really "reduced" anymore.
Not sure anyone sees "pure" RISC as being an advantage these days though. Didn't Intel demonstrate that you could get RISC-like performance from a CISC ISA even with all the drawbacks of x86 (instruction decoding complexity etc.)?
Not Linux but the Linux formal memory model. The idea is that the compiler optimizations can be as nasty as the Alpha out of order execution engine and cache.
The Linux code has to cater for these optimizations even though it will not result in an actual assembly instruction on anything except the Alpha. Problem is, on Alpha there's indeed an actual price to pay in performance for that nastiness.
I suspect you misunderstand the question. My question is if anybody is presently using alpha hardware to verify such correctness. I understand memory models and barriers etc. and that alpha is one of the most relaxed on this front, that it historically influenced the kernel code and was previously very important test hardware. But the hardware is now very dated, to the point where it might not be good test hardware.
The answer to that question is no, but the Alpha is still considered the least common denominator even though the hardware is obsolete. When people write litmus tests for the Linux memory model they are still validated against Alpha semantics, because compiler optimizations have the same reordering effects as the weird caches of Alpha processors.
(The stroke of genius of the C++11 memory model, compared to the older Java memory model, was that reordering could be treated the same way no matter if performed by processors or compilers).
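To make the litmus-test idea concrete, here is a toy sketch of the classic message-passing test. It is only an illustration under loud assumptions: the names (`WRITER`, `READER`, `outcomes`) are made up, and "reordering" is modeled crudely as permuting each thread's two operations, which is not how the kernel's memory-model tooling or real Alpha hardware works. It just enumerates every schedule and collects the values the reader can observe.

```python
from itertools import permutations

# Toy message-passing litmus test. All names and the model itself are
# illustrative assumptions, not the kernel's actual tooling.
WRITER = [("store", "x", 1), ("store", "y", 1)]      # publish data, then flag
READER = [("load", "y", "r1"), ("load", "x", "r2")]  # read flag, then data


def interleavings(a, b):
    """Yield every merge of lists a and b that keeps each list's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule):
    """Execute one schedule against shared memory and return (r1, r2)."""
    memory = {"x": 0, "y": 0}
    regs = {}
    for kind, loc, arg in schedule:
        if kind == "store":
            memory[loc] = arg
        else:
            regs[arg] = memory[loc]
    return regs["r1"], regs["r2"]


def outcomes(allow_reordering):
    """Collect all (r1, r2) results, optionally permuting ops within a thread."""
    writers = list(permutations(WRITER)) if allow_reordering else [tuple(WRITER)]
    readers = list(permutations(READER)) if allow_reordering else [tuple(READER)]
    results = set()
    for w in writers:
        for r in readers:
            for schedule in interleavings(list(w), list(r)):
                results.add(run(schedule))
    return results


in_order = outcomes(allow_reordering=False)
relaxed = outcomes(allow_reordering=True)
```

In this toy model the result `(1, 0)` (flag seen, data not seen) is absent from `in_order` and present in `relaxed`. It makes no difference to the enumeration whether a CPU or a compiler did the permuting, which is the point made above.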
I know that GCC for example is tested on a bunch of wacky old - read: obsolete ;) - hardware, so it's certainly possible that the same is true for Linux.
It's mildly interesting to me that there are now really no notable big-endian systems left, yet that's still the network byte order. I wonder what the math is for the amount of global CPU cycles wasted on byte-swapping for things that would do a fair amount of that... DNS for example.
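For a sense of what that swapping looks like, here is a small sketch using the 12-byte DNS header, which is six 16-bit fields in network byte order. The field values are made up for illustration, and `swap16` is a hypothetical helper standing in for the per-field swap a little-endian host performs.

```python
import struct

# DNS header: six 16-bit fields sent in network (big-endian) byte order.
# The values below are made up for illustration.
fields = (0x1234, 0x0100, 1, 0, 0, 0)  # id, flags, qdcount, ancount, nscount, arcount

wire = struct.pack("!6H", *fields)    # "!" means network byte order
little = struct.pack("<6H", *fields)  # the same fields as a little-endian CPU stores them


def swap16(value):
    """Byte-swap one 16-bit value (hypothetical helper for this sketch)."""
    return ((value & 0xFF) << 8) | (value >> 8)


# Swapping each field and then storing it little-endian reproduces the wire bytes.
swapped = struct.pack("<6H", *(swap16(f) for f in fields))
```

Here `wire == swapped` holds, while `little` differs from `wire` in every field whose two bytes are not equal. That is one swap per 16-bit field on each packet built or parsed.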
Though as we recently learned, it's considered sufficiently "fringe" by a big chunk of the development community that it's not that big a deal to drop support for it. (Not to imply IBM couldn't be sponsoring development for it more).
If you're talking about the Python cryptography fiasco, that was about dropping support for S390 (a 31-bit architecture discontinued in 1999). S390X (the 64-bit architecture introduced in 2000) is supported by Rust, though not necessarily by Python Cryptography.
Incidentally Rust's continued support for S390X is driven primarily by cuviper who works for Red Hat (even before the IBM acquisition).
Power ISA systems are still bi-endian and many systems run big. In fact, the low level OPAL interfaces require you to be in big-endian mode, AIX and i are still BE, and both FreeBSD and OpenBSD have BE flavours for current PowerNV systems. Even a few Linux distros run big (Adelie comes to mind). They're definitely a minority but they're still around.
Power ISA includes an endianness switch in the spec. Power and Power64 are all BE and LE. Most Linux distros only support modern versions on LE though. Debian has a BE port but it's not considered a primary release target.
The last PPC64 release of Ubuntu was 16.04 which is now out of support by about a month. Even on that, the two major web browsers didn't support building on the platform for a long time.
Yes, I wasn't claiming that no big-endian systems exist. Just that they are overwhelmingly in the minority now, and so the number of ASM byte-swapping ops happening is mildly amusing.
IBM mainframes are big endian essentially because punch cards are big endian.
(Punch cards are big endian because the number 123 is punched as "123". So that's the order a decimal number will be stored in memory. The System/360 mainframes (1964) had a lot of support for decimal numbers and it would be kind of bizarre to store decimal numbers big-endian and binary numbers little-endian so everything was big endian. IBM's current mainframes are compatible with S/360.
On the other hand, in a serial computer, you operate on one bit at a time, so you need to start with the smallest bit for arithmetic. The Intel 8008 was a copy of a serial TTL computer, the Datapoint 2200, so it was little-endian. x86 is based on the 8008, so it kept the little-endian architecture.)
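The serial-arithmetic point can be sketched in a few lines. This is only an illustration, not a model of the Datapoint 2200 itself, and the function names are made up. Bits arrive least-significant first, so the carry is always known by the time the next bit pair shows up.

```python
def to_lsb_first(value, width):
    """Split an integer into a list of bits, least-significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def from_lsb_first(bits):
    """Rebuild an integer from a least-significant-first bit list."""
    return sum(bit << i for i, bit in enumerate(bits))


def serial_add(a_bits, b_bits):
    """Add two equal-length LSB-first bit lists one bit pair at a time."""
    carry = 0
    out = []
    for a, b in zip(a_bits, b_bits):
        total = a + b + carry
        out.append(total & 1)  # sum bit for this position
        carry = total >> 1     # carry feeds the next, more significant position
    out.append(carry)          # final carry becomes the top bit
    return out


result = from_lsb_first(serial_add(to_lsb_first(200, 8), to_lsb_first(100, 8)))
```

Feeding the bits in most-significant first would not work with a single pass, since each output bit depends on a carry from positions that have not been seen yet.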
I know of at least one micro arch that'll fuse load and byte swap instructions into a reverse endian load. There's still probably a detectable overhead, but it's not the end of the world to hack on later.
There are various open-source and white-box network switches and routers - do any of them run big-endian? If not, it must be a solved problem (perhaps by fast-path dedicated ASICs).
Increasingly things seem to be moving towards ASICs for switching and general purpose CPUs (usually with a lot of support from the NIC offload capabilities) for routing, even in 'real' networking hardware.
The vast majority of fabric ASICs would never actually utilize additional TCAM necessary to support full tables at line rate in hardware because top of rack switches do not have that many addressable targets, so it's a wasted cost.
And with DPDK, optimized software implementations are achieving zero-drop line rate even on 100G+ interfaces at much, much lower cost than full-table routing ASICs married to fabric ASICs in a chassis switch.
It's not something a lot of users are aware of -- they often think they've bought an ASIC-based router! -- but essentially all of the big vendors' entry and mid-level devices are software routers, and they're even trying to figure out how to sell their NOS experience on whitebox hardware without undercutting their branded hardware.
> It's not something a lot of users are aware of -- they often think they've bought an ASIC-based router! -- but essentially all of the big vendors' entry and mid-level devices are software routers, and they're even trying to figure out how to sell their NOS experience on whitebox hardware without undercutting their branded hardware.
To be fair [to you], my original claim is a bit of a tautology as I don't really consider software/CPU based CPE gear to be 'real' networking.
I should be more specific. High radix switches/routers are, unequivocally, not built out of CPUs and software, period. To the point of the original discussion, these concentration points are the only place that byte order overhead would be significant. Others in this thread claim it's not significant even in CPU implementations due to optimized instructions, but I personally can't opine on that.
Last NPU I worked with (admittedly 10+ years ago) was little endian! It used load/store-swapped instructions. (Why? I can only guess that they licensed a little-endian CPU core for other reasons.)
Apple has been using ARM on and off since 1993. They have more long term organizational experience with ARM than they did with x86.
The Newton, then the iPod, then the iPhone, and now the M1.
The iPhone is a more important device for Apple than the Mac from a revenue point of view, and they've sold more devices with ARM chips in them than they have 68k, PowerPC, or x86. They've sold 2.2 billion iPhones. I can't find an easy number for how many Macintoshes they've sold in total, but I can't imagine it's close to that.
In fact, they used ARM in the Newton (1993) before they used PowerPC in the Power Mac (1994).
They sold their 40% stake in ARM when they were short of cash.
The switch from PowerPC to Intel, and then from Intel to ARM. I'm using Apple as a marker for the tipping point, when the new architecture was so much better than the old one that it completely took over. Obviously with 90% of Apple devices being ARM already it was an easier choice for them this time. But since each incumbent architecture gains more clout as the market grows many times bigger, it may be more difficult for a new entrant.
That's why RISC-V's win (if it occurs) will be because it's open source. Linux won in 30 years against everyone else due to that.
On the Apple-specific case I think any move to RISC-V would be because it would want more control than it has with Arm. It could then take the RISC-V ISA in the direction it wants.
I'm guessing it already has a lot of influence over Arm though and there are other factors that strongly act in favour of staying with Arm.
If Nvidia takes over Arm though and starts making life difficult for the ecosystem then that could change ....
Apple is really interesting. With chip design moved in-house and the ease with which they seem to switch architectures, they could move away from ARM if the Nvidia purchase happens. I think they’d want to avoid it, at least for the next 10 years.
It would be interesting to know how important the ARM instruction set is to Apple.
I wouldn't be completely surprised if there is a box running a build of Mac OS for RISC-V somewhere in Cupertino!
Seriously though, I suspect that the ISA isn't that important for Apple but on the other hand I think they're probably quite happy with the direction of the Arm ISA (probably had a big say in parts of it) and it would take quite a lot to push them away.
I think that the odds of the Nvidia takeover are quite small by now, so I don't think a move is likely at all.
Given Apple's history, and their business style, I don't think they have loyalty to any architecture or any specific technology in particular. They care about product first, and choose whatever technology they need to choose to get there. https://youtu.be/oeqPrUmVz-o?t=113
Pretty sure Apple has a permanent ARM license. They'll watch what happens with Nvidia, but it doesn't really affect them because, as you say, all the secret sauce is in-house.
Apple switched from Motorola 68k to PowerPC, too, and Sun switched from 68k to SPARC. The Amiga, NeXT, early Palm devices, and the ST were also using members of the 68k family. That's an ISA born in 1979 and largely replacing (and inspired by) the 6800 (1974) which had a 16-bit address bus and 8-bit memory bus and its (binary incompatible but with the same assembly language) little brother the 6809 (1978). The Tandy Color Computer and the Dragon were notable 6809 systems.
That, of course, is just with the Mac since Apple previously used variants of the MOS 6502 (1975 and allegedly an illicit clone of the MC6800). Apple, Atari, Acorn, Commodore (the owner of MOS for several years), BBC, Oric, and Nintendo used it in multiple systems each. Apple, Acorn, and Nintendo built additional systems on its updated sibling the WDC65816 series (1983).
The 6800/6809/Hitachi 6300/68k/Dragonball/Coldfire dynasty and the bastard MOS6502/WDC65816 families were collectively basically the ARM of their day in a way. Everyone targeting low-priced or power-sipping devices was building platforms around them at one time or another. Acorn went from a customer to a major competitor and successor.
It should be noted that the PowerPC and the whole POWER ISA multi-platform family was largely inspired by Apple in the first place. They were talking to IBM about a new platform and invited Motorola to the talks as their long-time processor provider. They formed the "AIM Alliance" that eventually morphed into the POWER Foundation and OpenPOWER initiatives. I can't really speak to how much of POWER ISA is inspired by Motorola's own "RISC" processor, the 88000 series.
Yes, but where can I buy a SPARC CPU? How many of those who have/can have it are running Illumos and are putting money/time in it? And more importantly what's the outlook for SPARC?
You can buy them used or new in various kinds of servers.
> How many of those who have/can have it are running Illumos and are putting money/time in it?
Dunno, I'm not really a Solaris guy. I use Solaris as a hypervisor for Linux and BSD LDOMs.
> And more importantly what's the outlook for SPARC?
Well, you could make the very same argument about Illumos. The Python developers wanted to drop support for Solaris already and OpenJDK upstream did actually drop it.
For illumos, the sweet spot is the 10-15 year old Sun gear you can pick up on eBay. Works well, supported, not overly expensive.
Newer SPARC systems are really quite good. And pretty cost-effective too. The problem is that the starting price is out of reach, and almost nobody is offering a cloud service based on SPARC, so you can't hire it either.
I'm running illumos on SPARC. I have some old hardware (desktop and server) that I like to make use of. Time, yes, but I'm not putting money into it.
And while OpenJDK upstream has dropped support for SPARC and Solaris, that was really all about problems with the Studio compiler. I'm maintaining an illumos OpenJDK port with the gcc toolchain on x86 - it's not excessively hard, and realistically if you're using a common toolchain and common CPU most standards-compliant code is portable at the OS layer.