by gnode 2385 days ago
Bits are addressable, just not with a normal pointer. It would have been possible to have a special fat pointer for bits, similar to how C++ sometimes has fat pointers for member functions (depending on compiler implementation).

The restriction in C can only be explained as a limitation of the language itself -- although probably motivated by the implementation complexity it would require, for a niche use case.
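A minimal sketch of what such a fat pointer for bits could look like if written by hand in C. The `bit_ptr` type and its helpers are hypothetical names invented for illustration, not part of any standard or compiler, and the LSB-first bit numbering is an arbitrary assumption:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "fat pointer" to a single bit: a byte address plus a bit
   index. Neither the type nor the helpers exist in standard C. */
typedef struct {
    unsigned char *byte; /* address of the byte that contains the bit */
    unsigned bit;        /* bit index within that byte, 0 .. CHAR_BIT-1 */
} bit_ptr;

/* Point at the n-th bit of a buffer (assumed LSB-first within each byte). */
static bit_ptr bit_ptr_at(void *base, size_t n) {
    bit_ptr p = { (unsigned char *)base + n / CHAR_BIT,
                  (unsigned)(n % CHAR_BIT) };
    return p;
}

/* "Dereference" for reading. */
static bool bit_load(bit_ptr p) {
    assert(p.bit < CHAR_BIT);
    return (*p.byte >> p.bit) & 1u;
}

/* "Dereference" for writing. */
static void bit_store(bit_ptr p, bool value) {
    assert(p.bit < CHAR_BIT);
    if (value)
        *p.byte |= (unsigned char)(1u << p.bit);
    else
        *p.byte &= (unsigned char)~(1u << p.bit);
}

/* "Pointer arithmetic": advance by one bit, carrying into the next byte. */
static bit_ptr bit_next(bit_ptr p) {
    if (++p.bit == CHAR_BIT) {
        p.bit = 0;
        ++p.byte;
    }
    return p;
}
```

The struct is twice the size of an ordinary pointer and every access turns into a shift and mask, which hints at the implementation complexity a built-in version would carry.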


I didn't mention pointers regarding bits, I mentioned addressability - a bit cannot have an address (in any language I'm aware of), though of course you can have any number of ways of accessing it.
Pointers, as a language concept, don't have to correspond to the addressing schemes of the hardware or ISA. On some architectures instructions may only be able to address aligned whole words. Some microcontrollers (e.g. Intel MCS-51) feature bit-addressable memory. Apparently, there's a special __bit type supported by the Small Device C Compiler for using bit-addressable memory on such devices, although I don't know if it has support for taking pointers to these.
They do not have to. But then it wouldn't be C, which by design has a straightforward and obvious mapping to the underlying machine.

For example, there are machines (some DSPs) where individual octets are not efficiently addressable, and a C byte on these machines is usually 16 or 32 bits.

Pointers are very much a language concept and very much not an architecture concept. I enjoy this particular writeup that touches on some of the distinctions. Of particular interest is the fact that the C standard itself states that two pointers are not equivalent simply by virtue of having the same address value.

https://www.ralfj.de/blog/2018/07/24/pointers-and-bytes.html

I also happen to very much enjoy this piece on how the C abstract machine has very little in common with modern architecture.

https://queue.acm.org/detail.cfm?id=3212479

This exchange was an enjoyable read. C was designed for portability because they had those PDP computers, or whatever they were, and each had its own unique architecture, switch arrangement for operation, and maybe even endianness. I don't know. The whole point was to make the language portable to any architecture for which someone cared to write a compiler. Why people do not like that, I cannot comprehend.
They don't have to, but they're commonly understood to refer to memory addresses, which, on most ISAs, are locations of octets.

Even if the ISA only allows word- or dword-aligned loads from memory, the addresses still typically enumerate bytes, not words or dwords.

Based on a quick summary of the MCS-51 that I googled up, it looks like its memory addressing scheme still assigns addresses to bytes, and has special operations that allow you to further specify a bit offset within that memory address.

> it looks like its memory addressing scheme still assigns addresses to bytes, and has special operations that allow you to further specify a bit offset within that memory address.

There are also instructions that take an 8-bit bit address, with 0x00 - 0x7f corresponding to lower memory and 0x80 - 0xff corresponding to 16 specific registers in the Special Function Register set.
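As a rough sketch of that scheme in C: the function below decodes an 8-bit bit address into a byte address and a bit index. The names are hypothetical, and the exact layout (bit addresses 0x00 - 0x7f landing in internal RAM bytes 0x20 - 0x2f, and 0x80 - 0xff landing in the SFRs whose byte address is a multiple of 8) is my understanding of the usual 8051 memory map, so check it against the datasheet for a specific part:

```c
#include <stdint.h>

/* Hypothetical helper type: where a given 8051 bit address lives. */
typedef struct {
    uint8_t byte_addr; /* direct byte address that holds the bit */
    uint8_t bit_index; /* position within that byte, 0..7 */
} mcs51_bit_location;

/* Decode an 8-bit bit address, assuming the usual 8051 layout. */
static mcs51_bit_location mcs51_decode_bit_addr(uint8_t bit_addr) {
    mcs51_bit_location loc;
    loc.bit_index = bit_addr & 0x07;
    if (bit_addr < 0x80) {
        /* Lower half: 128 bits packed into internal RAM 0x20..0x2f. */
        loc.byte_addr = (uint8_t)(0x20 + (bit_addr >> 3));
    } else {
        /* Upper half: 16 SFRs at byte addresses divisible by 8. */
        loc.byte_addr = bit_addr & 0xF8;
    }
    return loc;
}
```

So the bit address is its own small address space layered over ordinary byte addresses, rather than a finer-grained version of the byte address.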

The 8051 has bit addressable memory.
Isn't a byte supposed to correspond to the smallest addressable unit of memory?
The original usage of the term "byte" was to refer to variable-length fields of consecutive bits on a bit-addressable machine: https://en.wikipedia.org/wiki/Byte#History

Nowadays a byte is conventionally eight bits, especially for measures like "megabyte", but the term octet is often used to avoid ambiguity. Pointers commonly hold byte addresses, yet often only words are addressable by machine instructions (e.g. many ARM instructions take a byte address yet raise a hardware exception on use of unaligned addresses).

Interesting, but I think the notion of a byte in C is different. But I'm not able to look it up at the moment.
It is, but it's defined rather weirdly:

"byte: addressable unit of data storage large enough to hold any member of the basic character set of the execution environment"

Hence why the type that corresponds to it is "char"! Beyond that, the only thing that kinda sorta implies that it's the smallest addressable unit is the definition of CHAR_BIT:

"number of bits for smallest object that is not a bit-field (byte)"

It may well vary depending on which C standard you're talking about. ISO C defines both a byte and a char as at least long enough to contain characters "of the basic character set of the execution environment". They must be uniquely addressable, although it seems their definitions don't preclude them from being different, or sub-bytes from being uniquely addressable by pointers.
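For what it's worth, here is a small sketch of how those definitions surface in code, assuming a C11 compiler. `bits_in_bytes` is a hypothetical helper; `CHAR_BIT` and `static_assert` come from the standard headers:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* sizeof counts in units of char, so sizeof(char) is 1 by definition,
   however many bits a char has on the target. */
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");

/* The standard only guarantees a lower bound; a DSP may report 16 or 32. */
static_assert(CHAR_BIT >= 8, "a C byte has at least 8 bits");

/* Hypothetical helper: number of bits in an object of n bytes. */
static size_t bits_in_bytes(size_t n) {
    return n * CHAR_BIT;
}
```

Code that wants octets rather than C bytes has to say so explicitly, for example by checking `CHAR_BIT == 8` at compile time.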