by tyingq 1873 days ago
It's mildly interesting to me that there's now really no notable big-endian systems left, yet that's still the network byte order. I wonder what the math is for the amount of global wasted CPU cycles on byte-swapping for things that would do a fair amount of that...DNS for example.
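
For a sense of where those swaps happen in DNS: the fixed 12-byte message header is six 16-bit fields in network byte order, so a little-endian host reorders each one when decoding. A minimal sketch in Python, assuming a made-up sample header (`parse_dns_header` and `sample` are illustrative names, not from any DNS library):

```python
import struct

# The fixed DNS header is six unsigned 16-bit fields, all sent in
# network (big-endian) byte order. "!" selects that order in struct.
DNS_HEADER = struct.Struct("!HHHHHH")

def parse_dns_header(packet: bytes) -> dict:
    """Decode the 12-byte DNS header at the start of a packet."""
    ident, flags, qdcount, ancount, nscount, arcount = DNS_HEADER.unpack_from(packet)
    return {
        "id": ident,
        "flags": flags,
        "qdcount": qdcount,
        "ancount": ancount,
        "nscount": nscount,
        "arcount": arcount,
    }

# Hypothetical query header: ID 0x1a2b, RD flag set, one question.
sample = bytes.fromhex("1a2b01000001000000000000")
header = parse_dns_header(sample)

# Reading the same two ID bytes as little-endian gives the swapped value,
# which is the reordering a little-endian host has to perform per field.
misread_id = struct.unpack_from("<H", sample)[0]
```

That is six 16-bit reorders per header, before any of the record fields that follow it.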

> It's mildly interesting to me that there's now really no notable big-endian systems left

That's not correct. s390x is big-endian and well supported in all enterprise distributions such as SLE, RHEL as well as Debian and Ubuntu.

Though as we recently learned, it's considered sufficiently "fringe" by a big chunk of the development community that it's not that big a deal to drop support for it. (Not to imply IBM couldn't be sponsoring development for it more).
If you're talking about the Python cryptography fiasco, that was about dropping support for S390 (a 31-bit architecture discontinued in 1999). S390X (a 64-bit architecture introduced in 2000) is supported by Rust, though not necessarily by Python Cryptography.

Incidentally Rust's continued support for S390X is driven primarily by cuviper who works for Red Hat (even before the IBM acquisition).

But s390x support isn't dropped anywhere. On the contrary, IBM spends a lot of money and effort to make sure it is well supported by free software.
Notable in terms of global CPU capacity. Linux on zSeries is interesting, but only makes financial sense in some pretty limited scenarios.
Power ISA systems are still bi-endian and many systems run big. In fact, the low level OPAL interfaces require you to be in big-endian mode, AIX and i are still BE, and both FreeBSD and OpenBSD have BE flavours for current PowerNV systems. Even a few Linux distros run big (Adelie comes to mind). They're definitely a minority but they're still around.
Power ISA includes an endianness switch in the spec. Power and Power64 are all BE and LE. Most Linux distros only support modern versions on LE though. Debian has a BE port but it's not considered a primary release target.

The last PPC64 release of Ubuntu was 16.04, which is now out of support by about a month. Even on that, the two major web browsers didn't support building on the platform for a long time.

Yes, it can be done if you want enough to do it.

https://catfox.life/2018/11/03/clearing-confusion-regarding-... for more info

Yes, I wasn't claiming that no big-endian systems exist. Just that they are overwhelmingly in the minority now, and so the number of ASM byte-swapping ops happening is mildly amusing.
Many CPUs have load/store instructions that perform the network byte order swap with no/minimal overhead.

Serialization formats like JSON/YAML/protobuf/etc. would be much more costly by comparison.
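
To put that comparison in concrete terms, here is a rough sketch in Python of the operation itself (not a benchmark; Python timings say nothing about hardware cost). A 32-bit byte swap is four masks and shifts, which C compilers commonly reduce to a single instruction, while a text format has to turn the same integer into decimal digits and parse it back. `bswap32` is an illustrative helper name:

```python
import json
import struct

def bswap32(value: int) -> int:
    """Reverse the byte order of an unsigned 32-bit integer."""
    return (
        ((value & 0x000000FF) << 24)
        | ((value & 0x0000FF00) << 8)
        | ((value & 0x00FF0000) >> 8)
        | ((value & 0xFF000000) >> 24)
    )

value = 0x12345678
swapped = bswap32(value)

# Fixed-width binary: always 4 bytes, whichever byte order is chosen.
wire_binary = struct.pack("!I", value)

# Text serialization: the integer becomes decimal digits that must be
# formatted on send and parsed again on receive.
wire_json = json.dumps(value).encode("ascii")
decoded = json.loads(wire_json.decode("ascii"))
```

Here the JSON form is nine ASCII digits against four bytes on the wire, and decoding it takes a digit-by-digit parse rather than a fixed reorder.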

IIRC ARM devices can also be big-endian and GCC can even generate big endian 64-bit ARM code:

https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html

Yeah, you can find an ARM big-endian distribution of, for example, NetBSD. No Linux that I can find. Apparently boot issues are a bit tricky.
I have found a Gentoo distribution with big endian for the Raspberry Pi 3, so it is out there: https://github.com/zeldin/linux-1/releases
I'm fairly sure that NetBSD/arm switches to big endian once the kernel is running, the boot process is unchanged.
The thing holding it back for Rpi4 is UEFI+ACPI, so I assume there's some boot process changes.
ACPI is problematic in big endian.
You used to be able to get Debian arm-be, but that was a good 15 years ago.
IBM mainframes are big endian, and all the Linux distros support them too.
IBM mainframes are big endian essentially because punch cards are big endian.

(Punch cards are big endian because the number 123 is punched as "123". So that's the order a decimal number will be stored in memory. The System/360 mainframes (1964) had a lot of support for decimal numbers and it would be kind of bizarre to store decimal numbers big-endian and binary numbers little-endian so everything was big endian. IBM's current mainframes are compatible with S/360.

On the other hand, in a serial computer, you operate on one bit at a time, so you need to start with the smallest bit for arithmetic. The Intel 8008 was a copy of a serial TTL computer, the Datapoint 2200, so it was little-endian. x86 is based on the 8008, so it kept the little-endian architecture.)
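
The serial-arithmetic point can be sketched in a few lines of Python. This is only an illustration of why least-significant-first order is natural when one bit is processed per step; it is not a model of the Datapoint 2200 or the 8008, and all names here are made up for the example:

```python
def to_bits_lsb_first(value: int, width: int) -> list:
    """Split an integer into bits, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]

def from_bits_lsb_first(bits: list) -> int:
    """Rebuild an integer from bits given least significant bit first."""
    return sum(bit << i for i, bit in enumerate(bits))

def serial_add(a_bits: list, b_bits: list) -> tuple:
    """Add two equal-length bit streams one bit per step.

    Only a single carry bit is kept between steps, so the streams must
    arrive lowest bit first: the carry out of bit i feeds bit i + 1.
    """
    carry = 0
    result = []
    for a, b in zip(a_bits, b_bits):
        total = a + b + carry
        result.append(total & 1)
        carry = total >> 1
    return result, carry

sum_bits, carry_out = serial_add(to_bits_lsb_first(200, 8), to_bits_lsb_first(100, 8))
low_byte = from_bits_lsb_first(sum_bits)
```

With 8-bit inputs 200 and 100, the low byte comes out as 44 with a carry of 1 (300 = 256 + 44). Feeding the streams highest bit first would not work with a single carry bit, since each step would need a carry that has not been computed yet.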

I know of at least one microarchitecture that'll fuse load and byte swap instructions into a reverse-endian load. There's still probably a detectable overhead, but it's not the end of the world to hack on later.
There are various open-source and white-box network switches and routers - do any of them run big-endian? If not, it must be a solved problem (perhaps by fast-path dedicated ASICs).
> If not, it must be a solved problem (perhaps by fast-path dedicated ASICs).

Correct. The data plane of all 'real' networking is done in ASICs and/or NPUs.

Surprisingly less true these days.

Increasingly things seem to be moving towards ASICs for switching and general purpose CPUs (usually with a lot of support from the NIC offload capabilities) for routing, even in 'real' networking hardware.

The vast majority of fabric ASICs would never actually utilize additional TCAM necessary to support full tables at line rate in hardware because top of rack switches do not have that many addressable targets, so it's a wasted cost.

And with DPDK, optimized software implementations are achieving zero-drop line rate for even 100G+ interfaces for much, much lower cost than full-table routing ASICs married to fabric ASICs in a chassis switch.

It's not something a lot of users are aware of -- they often think they've bought an ASIC-based router! -- but essentially all of the big vendors' entry and mid-level devices are software routers, and they're even trying to figure out how to sell their NOS experience on whitebox hardware without undercutting their branded hardware.

> It's not something a lot of users are aware of -- they often think they've bought an ASIC-based router! -- but essentially all of the big vendors' entry and mid-level devices are software routers, and they're even trying to figure out how to sell their NOS experience on whitebox hardware without undercutting their branded hardware.

To be fair [to you], my original claim is a bit of a tautology as I don't really consider software/CPU based CPE gear to be 'real' networking.

I should be more specific. High radix switches/routers are, unequivocally, not built out of CPUs and software, period. To the point of the original discussion, these concentration points are the only place that byte order overhead would be significant. Others in this thread claim it's not significant even in CPU implementations due to optimized instructions, but I personally can't opine on that.

Last NPU I worked with (admittedly 10+ years ago) was little endian! It used load/store-swapped instructions. (Why? I can only guess that they licensed a little-endian CPU core for other reasons.)