by int_19h 2377 days ago
C definitely doesn't require bytes to have 8 bits; it only requires them to have at least 8. And there are architectures, such as SHARC, on which a C char has as many bits as an int.
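To make the "at least 8 bits" point concrete, here is a minimal sketch using CHAR_BIT from <limits.h>. The helper names int_size_in_chars and int_storage_bits are made up for illustration. The two static assertions hold on any conforming implementation; the values the helpers return depend on the platform.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Guaranteed by the C standard on every conforming implementation. */
_Static_assert(CHAR_BIT >= 8, "a C byte has at least 8 bits");
_Static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");

/* Hypothetical helper: size of int measured in chars (C bytes).
   Typically 4 on mainstream platforms; it would be 1 on an
   implementation where char is as wide as int. */
size_t int_size_in_chars(void)
{
    return sizeof(int);
}

/* Hypothetical helper: storage width of int in bits, padding included. */
size_t int_storage_bits(void)
{
    return sizeof(int) * CHAR_BIT;
}
```

On a typical desktop target these return 4 and 32. With a compiler where char and int are both 32 bits, they would return 1 and 32.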

The question, though, was about whether it's the minimum addressable unit of memory. In the C memory model it is, but only by implication: you can't have two pointers that compare non-equal but differ by less than 1, so a type with sizeof == 1 is by definition the smallest thing you can uniquely address. However, the C memory model doesn't have to reflect the underlying hardware architecture.
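A small sketch of that implication, with made-up function names: adjacent char objects are exactly one pointer step apart, and the distance between adjacent objects of any other type, measured through char pointers, is that type's sizeof.

```c
#include <assert.h>
#include <stddef.h>

/* Distance between two adjacent chars: always exactly 1. */
ptrdiff_t adjacent_char_distance(void)
{
    char pair[2] = {0, 0};
    return &pair[1] - &pair[0];
}

/* Distance between two adjacent ints, measured in chars.
   Always equals sizeof(int), whatever that is on the platform. */
ptrdiff_t adjacent_int_distance_in_chars(void)
{
    int pair[2] = {0, 0};
    return (char *)&pair[1] - (char *)&pair[0];
}
```

Pointer subtraction yields whole numbers only, so nothing can sit "between" two adjacent chars as far as the language is concerned.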

1 comments

SHARC itself has no such requirement, and having char and int the same size was not universal on it. The CPU vendor shipped a compiler that worked that way, but that was not the only compiler.

The CPU itself used 32-bit addresses to access machine words, the size of which was determined by what was being accessed. External memory was limited to 32-bit words. Internal memory had regions whose words could be 32-bit, 40-bit, or 48-bit. An address increment of 1 would thus move by 32, 40, or 48 bits, depending on the region.

Mercury Computer Systems shipped a byte-oriented port of gcc. Pointers to char and short were rotated and XORed as needed to reduce incompatibility. Pointers to larger objects were in the hardware format. This allowed a high degree of compatibility with ordinary software while still running efficiently when working with the larger objects. There was also a 64-bit double, unlike the 32-bit one in the other compiler. Data structures were all compatible with PowerPC and i860, allowing heterogeneous shared memory multiprocessor systems.

You can implement byte addressing on any architecture, of course. That's what I meant by "the C memory model doesn't have to reflect the underlying hardware architecture". But as you point out yourself, this requires pointers which are basically not raw hardware addresses, and which are more expensive to work with, because they require the compiler to do the same kind of stuff it has to do for bit fields. So the natural implementation - with no unexpected perf gotchas - tends towards pointers as raw hardware addresses, and thus char as the smallest unit those can address.
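To illustrate the cost being described, here is a rough sketch of byte access emulated on word-addressed memory. Everything in it is an assumption made for illustration: the byte_ptr encoding (word index in the upper bits, byte lane in the low 2 bits), the little-endian lane order, and the word_memory array standing in for RAM. It is not the scheme any particular SHARC compiler used.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_COUNT 4u

/* Stand-in for memory that can only be addressed in 32-bit words. */
static uint32_t word_memory[WORD_COUNT];

/* Hypothetical byte pointer: bits 2 and up select the word,
   the low 2 bits select one of the four 8-bit lanes in it. */
typedef uint32_t byte_ptr;

/* Load one 8-bit value: fetch the whole word, shift, then mask. */
unsigned load_byte(byte_ptr p)
{
    assert((p >> 2) < WORD_COUNT);
    uint32_t word = word_memory[p >> 2];
    unsigned shift = (unsigned)(p & 3u) * 8u;
    return (unsigned)((word >> shift) & 0xFFu);
}

/* Store one 8-bit value: read-modify-write of the containing word,
   much like updating a bit field. */
void store_byte(byte_ptr p, unsigned value)
{
    assert((p >> 2) < WORD_COUNT);
    unsigned shift = (unsigned)(p & 3u) * 8u;
    uint32_t mask = (uint32_t)0xFFu << shift;
    uint32_t *word = &word_memory[p >> 2];
    *word = (*word & ~mask) | (((uint32_t)value & 0xFFu) << shift);
}
```

Every byte load costs an extra shift and mask, and every byte store becomes a read-modify-write of the whole word. A plain word access through a raw hardware address needs none of that.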