The standard doesn't even guarantee that pointer arithmetic works at all outside the bounds of an object (or at least it didn't in C89, which is the only standard I know well).
If you have an object which is ten bytes long, like:
uint8_t o[10];
...then it's legal to construct pointers to o[0] through o[10] (not o[9]: you can create a pointer to the byte immediately after the object) and nowhere else. It's not even legal to calculate a pointer outside that range, let alone dereference it.
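To make the one-past-the-end rule concrete, here's a minimal sketch of the usual idiom that relies on it. The function name sum_bytes is made up for illustration; the point is only that buf + len may be formed and compared against, but never dereferenced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sums len bytes starting at buf (sum_bytes is a hypothetical helper). */
unsigned sum_bytes(const uint8_t *buf, size_t len)
{
    const uint8_t *end;
    const uint8_t *p;
    unsigned total = 0;

    assert(buf != NULL);
    end = buf + len;              /* one past the end: legal to compute and compare */
    for (p = buf; p != end; ++p)
        total += *p;              /* p is strictly inside the object whenever it is dereferenced */
    return total;
}
```

Computing buf + len + 1 or buf - 1 would already be undefined, even if the result were never used.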
I used this to make a prototype compiler from C to Javascript/Perl/Lua, where each C pointer was represented as a tuple of (array, offset). Pointer arithmetic worked inside objects; pointer arithmetic between objects wasn't supported. Worked nicely.
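For anyone curious what the (array, offset) representation looks like, here's a rough sketch written in C itself rather than in one of the target languages. It is not the code from that compiler: the names fatptr, fat_add and fat_load are invented, and storing the length alongside the base is my assumption (in Javascript/Perl/Lua the array already knows its own length).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "fat pointer": which object, and where inside it. */
typedef struct {
    uint8_t *base;   /* the object this pointer belongs to */
    size_t   len;    /* size of that object in bytes (assumed to be tracked) */
    size_t   off;    /* current offset, always in 0..len inclusive */
} fatptr;

/* Moves p by delta bytes. Returns 1 on success, or 0 (leaving p unchanged)
   if the result would fall outside [0, len], the range C permits. */
int fat_add(fatptr *p, ptrdiff_t delta)
{
    assert(p->off <= p->len);
    if (delta >= 0) {
        if ((size_t)delta > p->len - p->off)
            return 0;
        p->off += (size_t)delta;
    } else {
        /* magnitude of delta, computed without negating the minimum value */
        size_t back = (size_t)(-(delta + 1)) + 1;
        if (back > p->off)
            return 0;
        p->off -= back;
    }
    return 1;
}

/* Reads the byte p points at. Returns 0 if p is one past the end. */
int fat_load(const fatptr *p, uint8_t *out)
{
    if (p->off >= p->len)
        return 0;
    *out = p->base[p->off];
    return 1;
}
```

Arithmetic between two different objects has no meaning in this model, because the offsets belong to different bases. That matches what the standard allows.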
Accessing beyond bounds is undefined, but leaving pointer arithmetic outside of objects undefined would preclude a lot. For example, building an OS page table or DMA, or RDMA, or MMIO, etc...
No, even out-of-bounds arithmetic is undefined. 6.5.6 Additive operators paragraph 8 (cribbed from Stack Overflow [1]):
When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. [...] If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
This doesn't say anything about pointer arithmetic on pointers to raw memory though. For example, using an mmap file, there isn't any object, and there aren't any bounds.
That's correct --- the standard doesn't say anything about raw memory. Only pointers to objects defined by C are defined to work; everything else is an implementation-specific extension (and so, covered under 'undefined behaviour').
I believe that C99 added the ability to losslessly cast from a pointer to a uintptr_t and back again, but, IIRC, the compiler didn't have to support this in C89.
These cases do work if you are operating on (uintptr_t)&object instead. The C machine model is more restrictive than the von Neumann machine model supported by the likes of x86 and ARM.
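A small sketch of what that looks like, with the caveats spelled out. The helper names addr_offset and roundtrip are made up. uintptr_t is an optional type in C99, and the standard only promises that a valid void pointer converted to uintptr_t and back compares equal to the original; what you get from converting some other integer back to a pointer is implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Computes an address delta bytes away from p as a plain integer.
   This is unsigned integer arithmetic, so it wraps instead of being undefined. */
uintptr_t addr_offset(const void *p, intptr_t delta)
{
    return (uintptr_t)p + (uintptr_t)delta;
}

/* Converts a pointer to an integer and back again. */
void *roundtrip(void *p)
{
    uintptr_t bits = (uintptr_t)p;
    void *q = (void *)bits;
    assert(q == p);   /* guaranteed for valid void pointers where uintptr_t exists */
    return q;
}
```

The arithmetic itself is fine anywhere; the implementation-specific part is what happens when the resulting integer is turned back into a pointer and dereferenced.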
Non-linear address spaces shouldn't affect pointer arithmetic though, unless I'm misunderstanding something. Otherwise, I guess the implication is that there are systems or implementations that rely on negative pointers; in which case I would think it should be up to the implementation.
Consider a system where an address consists of a segment identifier combined with a byte offset within the segment. The relationship between different segments is unspecified. Pointer equality has to consider both parts of the address, but "<", "<=", ">", and ">=" can ignore the segment identifier and compare only the offsets.
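Here's a toy model of that, just to show the consequence. The segptr type and the 16-bit field widths are assumptions for illustration, not any real architecture's layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical segmented address: segment identifier plus byte offset. */
typedef struct {
    uint16_t segment;
    uint16_t offset;
} segptr;

/* Equality has to look at both parts. */
int seg_eq(segptr a, segptr b)
{
    return a.segment == b.segment && a.offset == b.offset;
}

/* Ordering looks at the offset only, which is cheap but only meaningful
   when both pointers are in the same segment (i.e. the same object). */
int seg_lt(segptr a, segptr b)
{
    return a.offset < b.offset;
}

/* Two addresses in different segments with the same offset. */
void seg_demo(void)
{
    segptr x = { 1, 0x0100 };
    segptr z = { 2, 0x0100 };

    assert(!seg_eq(x, z));                    /* they are different addresses... */
    assert(!seg_lt(x, z) && !seg_lt(z, x));   /* ...yet neither is less than the other */
}
```

So "<" on such a machine isn't a total order over all addresses, which is why the standard only defines it within a single object.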
Given two distinct objects, x and y, (&x == &y) is meaningful, but (&x < &y) isn't particularly. (Except that sometimes it would be convenient to have a consistent total ordering on addresses, something that C doesn't define.)
I can see that a non-linear address space doesn't imply a strict weak ordering, but this still seems to be an implementation-defined detail. It doesn't imply anything about pointer arithmetic overflow.
For example, consider x86 segments. Is there a reason why you would use negative offsets? Given a segment address, the number of representable values is identical whether the offset is strictly positive, or negative with a shifted segment address (assuming two's complement).