


	by gdwatson 3752 days ago
	I assume it's because the standard doesn't guarantee a linear address space.


david-given 3752 days ago

The standard doesn't even guarantee that pointer arithmetic works at all outside the bounds of an object (or at least it didn't in C89, which is the only standard I know well).

If you have an object which is ten bytes long, like:

    uint8_t o[10];

...then it's legal to construct pointers to o[0] through o[10] (not just o[9]: you can create a pointer to the byte immediately after the object) and nowhere else. It's not even legal to calculate a pointer outside that range, let alone dereference it.
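A minimal sketch of why the one-past-the-end pointer is useful: it can serve as a loop bound without ever being dereferenced. The function name `sum_bytes` is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sums n bytes, using the one-past-the-end pointer as the loop bound. */
unsigned sum_bytes(const uint8_t *first, size_t n)
{
    assert(first != NULL || n == 0);
    const uint8_t *end = first + n;   /* legal: at most one past the last element */
    unsigned total = 0;
    for (const uint8_t *p = first; p != end; ++p)
        total += *p;                  /* p is never dereferenced when p == end */
    return total;
}
```

With `uint8_t o[10]`, calling `sum_bytes(o, 10)` forms `o + 10`, which is fine. Forming `o + 11` or `o - 1` would be undefined behaviour even if the result were never used.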

I used this to make a prototype compiler from C to Javascript/Perl/Lua, where each C pointer was represented as a tuple of (array, offset). Pointer arithmetic worked inside objects; pointer arithmetic between objects wasn't supported. Worked nicely.
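The (array, offset) representation can be sketched in C itself. This is not the prototype compiler's code, only an illustration of the idea; `fat_ptr`, `fat_make`, `fat_add` and `fat_diff` are hypothetical names, and the explicit `size` field is an assumption added so the bounds can be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "fat pointer": the object it points into, plus an offset. */
typedef struct {
    uint8_t *base;    /* start of the underlying object */
    size_t   size;    /* object size in bytes */
    size_t   offset;  /* 0..size inclusive; size means one past the end */
} fat_ptr;

fat_ptr fat_make(uint8_t *base, size_t size)
{
    fat_ptr p = { base, size, 0 };
    return p;
}

/* Pointer + integer. Returns 1 and writes *out if the result stays inside
   the object (or one past its end); returns 0 and leaves *out alone otherwise. */
int fat_add(fat_ptr p, ptrdiff_t delta, fat_ptr *out)
{
    assert(p.offset <= p.size);
    if (delta < 0) {
        size_t back = (size_t)0 - (size_t)delta;   /* magnitude of delta */
        if (back > p.offset) return 0;
        p.offset -= back;
    } else {
        if ((size_t)delta > p.size - p.offset) return 0;
        p.offset += (size_t)delta;
    }
    *out = p;
    return 1;
}

/* Pointer - pointer. Only supported within one object. */
int fat_diff(fat_ptr a, fat_ptr b, ptrdiff_t *out)
{
    if (a.base != b.base) return 0;
    *out = (ptrdiff_t)a.offset - (ptrdiff_t)b.offset;
    return 1;
}
```

Arithmetic inside an object works, and arithmetic between objects is simply refused, which matches what the standard permits.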


uxcn 3752 days ago

Accessing beyond bounds is undefined, but leaving pointer arithmetic outside of objects undefined would preclude a lot: for example, building an OS page table, or doing DMA, RDMA, or MMIO.


pcwalton 3752 days ago

No, even out-of-bounds arithmetic is undefined. 6.5.6 Additive operators paragraph 8 (cribbed from Stack Overflow [1]):

When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. [...] If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

[1]: http://stackoverflow.com/questions/18186987/decrementing-a-p...


uxcn 3752 days ago

This doesn't say anything about pointer arithmetic on pointers to raw memory though. For example, with an mmap'd file, there isn't any object, and there aren't any bounds.


david-given 3751 days ago

That's correct --- the standard doesn't say anything about raw memory. Only pointers to objects defined by C are defined to work; everything else is an implementation-specific extension (and so, covered under 'undefined behaviour').

I believe that C99 added the ability to losslessly cast from a pointer to a uintptr_t and back again, but, IIRC, the compiler didn't have to support this in C89.
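A small sketch of that round trip, assuming the implementation provides `uintptr_t` at all (C99 makes the type optional). The function name `through_uintptr` is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a pointer to an integer and back again. */
void *through_uintptr(void *p)
{
    uintptr_t bits = (uintptr_t)p;   /* the integer value itself is implementation-defined */
    void *back = (void *)bits;
    assert(back == p);               /* C99: a valid void pointer survives the round trip */
    return back;
}
```

The guarantee only covers getting the same pointer back. What the integer looks like in between is up to the implementation.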


brandmeyer 3752 days ago

These cases do work if you are operating on (uintptr_t)&object instead. The C machine model is more restrictive than the von Neumann machine model supported by the likes of x86 and ARM.
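As a rough illustration, the arithmetic below happens on integers, so the pointer rules of 6.5.6 don't apply to it. The helper names and the 4096-byte page size are assumptions for the example, and whether converting a result back to a pointer gives something usable is implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Rounds an address down to a multiple of align (a power of two). */
uintptr_t align_down(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & ~(align - 1);
}

/* Offset of an address within its page, for an assumed 4096-byte page. */
uintptr_t page_offset(uintptr_t addr)
{
    return addr - align_down(addr, 4096);
}
```

For an object `obj`, `align_down((uintptr_t)&obj, 4096)` is ordinary unsigned arithmetic, even though the equivalent computation on `&obj` as a pointer would step outside the object.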


uxcn 3752 days ago

Non-linear address spaces shouldn't affect pointer arithmetic though, unless I'm misunderstanding something. Otherwise, I guess the implication is that there are systems or implementations that rely on negative pointers; in which case I would think it should be up to the implementation.


_kst_ 3752 days ago

Consider a system where an address consists of a segment identifier combined with a byte offset within the segment. The relationship between different segments is unspecified. Pointer equality has to consider both parts of the address, but "<", "<=", ">", and ">=" can ignore the segment identifier and compare only the offsets.

Given two distinct objects, x and y, (&x == &y) is meaningful, but (&x < &y) isn't particularly. (Except that sometimes it would be convenient to have a consistent total ordering on addresses, something that C doesn't define.)
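A toy model of such a system, to make the equality/ordering split concrete. The `seg_addr` type, the 16-bit field widths, and the function names are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical segmented address: segment identifier plus byte offset. */
typedef struct {
    uint16_t segment;
    uint16_t offset;
} seg_addr;

/* Equality has to look at both parts. */
int seg_equal(seg_addr a, seg_addr b)
{
    return a.segment == b.segment && a.offset == b.offset;
}

/* "<" compares offsets only, so the answer means something
   only when both addresses lie in the same segment. */
int seg_less(seg_addr a, seg_addr b)
{
    return a.offset < b.offset;
}

/* Two addresses in different segments with the same offset:
   unequal, yet neither is "less" than the other. */
void ordering_demo(void)
{
    seg_addr x = { 1, 0x20 };
    seg_addr y = { 2, 0x20 };
    assert(!seg_equal(x, y));
    assert(!seg_less(x, y) && !seg_less(y, x));
}
```

The demo shows why this comparison is not a consistent total ordering: distinct addresses can be mutually "not less".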


uxcn 3752 days ago

I can see that a non-linear address space doesn't imply a strict weak ordering, but this still seems to be an implementation-defined detail. It doesn't imply anything about pointer arithmetic overflow.

For example, consider x86 segments. Is there a reason why you would use negative offsets? Given a segment address, the number of representable values is identical whether the offset is strictly positive, or negative with a shifted segment address (assuming two's complement).
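To make the shifted-segment point concrete, here is a sketch assuming 8086 real-mode addressing, where the linear address is segment * 16 + offset. The function names are made up, and the signed variant is a thought experiment rather than something the hardware does.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode linear address with the usual unsigned 16-bit offset. */
uint32_t linear_unsigned(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* The same computation if offsets were read as signed (hypothetical). */
uint32_t linear_signed(uint16_t segment, int16_t offset)
{
    int32_t addr = (int32_t)segment * 16 + offset;
    assert(addr >= 0);   /* below address zero is out of scope for this sketch */
    return (uint32_t)addr;
}

/* One byte, reachable with a positive offset or with a
   negative offset from a shifted segment. */
void alias_demo(void)
{
    assert(linear_unsigned(0x1000, 0x0010) == linear_unsigned(0x1001, 0x0000));
    assert(linear_signed(0x1001, -16) == linear_unsigned(0x1000, 0x0000));
}
```

Either way, a 16-bit offset reaches 65536 bytes per segment value; the sign convention only moves where that window sits relative to the segment base.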
