| > So a 32-bit integer is the product of 32 two-state bit types. Something akin to NonZero could be defined as that type minus one state, such that there are now 4294967296 - 1 representable values. I can see a couple of problem with this approach: - you need to be able to talk about bit-level types, which contradicts the common assumption that all types are addressable, i.e. whose size and alignment is some positive integer (or zero) number of bytes; - what if you want to substract more complex set of values? For example describing a 32 bit number without all multiples of 7? And how do you encode this in the compiler in such a way that type checking remains decidable? - how do you safely and concisely express conversions and operations on these kind of types? > Similarly, pointer types on some machines always have some bits set to 0 due to hardware constraints. These can be represented in the type system as 2^64 / 2^n where 'n' is the number of bits that are not usable, resulting in something like 2^46 for typical CPUs. This would allow extra bits of state to be "packed in there". Note that this is not forward compatible, since newer CPUs can start using more bits. In fact recent CPUs started using 57 bits instead of 48 bits, so a program that packed more state in those top bits would now be broken. You should generally instead try to pack data in the least significant bits that will always be 0 due to alignment constraints. Moreover the top bits are not always 0, they are equal to the most relevant bit of the used part of the address. On Linux this just happens to always be 0 for userspace addresses (and 1 for kernel addresses) but this won't be the case on all architectures and OSes. I also wonder how you would define these types using subtraction/division types such that they are different? - the address type being 64 bits but using only the least significant 48 bits and having the top 16 bits always 0
- the address type being 64 bits but using only the least significant 48 bits and having the top 16 bits always equal to the 48th bit
- the address type being 48 bits Clearly these types are all isomorphic, but they are definitionally different and in a way that really matters to the hardware. |
This would work best in a high level language where the specific bit layout of struct types is not defined (by default). Rust is one such language, but this would also work with .NET and JVM languages.
One approach is that integers are represented as A < ( x < S + O ) < B. This would allow ranges, powers of two, “NonZero”, offsets, and the like the be represented in the lowest levels of the type system. Additionally, the high level type system could also keep an additional list of specific numbers that are excluded.
Pointer types could be internally represented as ordinary integers, or aligned non-null pointers on some architectures would be “0 < x << 3”.
This could have no effect on the emitted code, but the compiler would be free to utilise the spare low bits or the non-zero value if it chose to do so. Rust does this in a few hard-coded scenarios, but a more complete type model would allow more flexibility. I.e.: it could pack the enum discriminator and the value in there if the enum has only pointer types as values.
Conversely, the high level language can use this for type checks to give better error messages.