by akvadrako 3545 days ago
I would tend to disagree. For example, casting void * to other pointer types is undefined, but this construct is often used:

  void func(void *ptr)
    {
    uint32_t *ip = ptr;
    ptr[0] = 123;
    }

This is wrong. Casting to and from void * is defined for all pointer types except function pointers, according to the standard.

Otherwise almost every assignment after a call to malloc would be undefined.
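To make the malloc point concrete, here is a minimal sketch of the round trip through void *. The helper name roundtrip_through_void is made up for illustration; the asserts are only there to state the assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: stores a value in malloc'd memory and reads it back
   after converting the pointer to void * and back again. */
uint32_t roundtrip_through_void(uint32_t value)
{
    void *raw = malloc(sizeof(uint32_t)); /* malloc returns void * */
    assert(raw != NULL);

    uint32_t *ip = raw;      /* void * -> uint32_t *, no cast needed in C */
    *ip = value;

    void *back = ip;         /* uint32_t * -> void * */
    uint32_t *again = back;  /* ... and back again */
    assert(again == ip);     /* the round trip compares equal to the original */

    uint32_t result = *again;
    free(raw);
    return result;
}
```

Calling roundtrip_through_void(123) should simply hand back 123, since the memory from malloc is suitably aligned for uint32_t and only ever accessed as uint32_t.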

Yes, you are right, void * is an exception. However, other pointer types cannot be reliably cast:

From C1X, section 6.3.2.3:

"A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined."

Though that is quite odd, since any pointer can be converted to void *, which only needs the alignment of the char type. So converting from x* -> y* is undefined, but x* -> void* -> y* is defined.

That might not necessarily work. If you have something like:

  uint8_t x[100];
  uint32_t *y = (uint32_t *)&x[1];

And then dereference y, most RISC architectures will trap on the unaligned access. It doesn't matter if there is an intermediate void pointer or not.
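For comparison, a sketch of one way to read those four bytes without ever forming a misaligned uint32_t pointer: copy them out with memcpy, which has no alignment requirement on its arguments. The names load_u32_unaligned and demo_unaligned_read are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads 4 bytes starting at any byte address. */
uint32_t load_u32_unaligned(const uint8_t *src)
{
    uint32_t value;
    assert(src != NULL);
    memcpy(&value, src, sizeof value); /* byte-wise copy, no aligned access needed */
    return value;
}

/* Mirrors the x[100] / &x[1] example, with every byte set to 0xAB so the
   result does not depend on byte order. */
uint32_t demo_unaligned_read(void)
{
    uint8_t x[100];
    memset(x, 0xAB, sizeof x);
    return load_u32_unaligned(&x[1]);
}
```

The value you get back for real data still depends on the machine's byte order; the sketch only sidesteps the alignment problem.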

I am not trying to say it'll work, I'm trying to show that most non-trivial C programs invoke undefined behaviour, according to the spec.

According to my reading, an intermediate void pointer allows the pointer casting to stay well defined. However, this seems unsafe even without getting into dereferencing, because implementations are allowed to omit bits when storing pointers they assume to be aligned.

I'd say my example demonstrates the spec's statement:

"A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined."

The resulting uint32_t pointer in my example is not correctly aligned for the referenced type, so undefined behavior (e.g., a trap on RISC) occurs. What's an example of a statement in a "non-trivial" C program that is in common use but that you think is undefined?

Okay, now there are two different topics.

(1) I didn't say your example didn't demonstrate a violation, but it misses the point, because it doesn't invoke an intermediate void pointer:

"A pointer to void may be converted to or from a pointer to any object type. A pointer to any object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer."

(2) That was my attempt at coming up with a good example, but it seems, due to the above clause, the casting between incompatible pointers via void * is technically "legal".

To try and give some actual examples:

Undefined:

  int64_t a = 42;
  void* p = &a;
  int32_t* i = p;
  printf("%i", *i);
Implementation defined, as type punning to char is legal (allowing the implementation of memcpy):

  int64_t a = 42;
  void* p = &a;
  char* ch = p;
  printf("%c", *ch);
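For contrast with the first sample, a sketch of getting a 32-bit value out of the int64_t without reading it through an int32_t lvalue: copy the bytes with memcpy. The function names are hypothetical, and which half of the int64_t you get depends on byte order, so the demo uses a value whose two halves are identical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies the first 4 bytes of an int64_t into an int32_t.
   The int64_t object is only ever accessed as bytes, never as an int32_t. */
int32_t first_four_bytes(const int64_t *src)
{
    int32_t out;
    assert(src != NULL);
    memcpy(&out, src, sizeof out);
    return out;
}

int32_t demo_first_four_bytes(void)
{
    /* Both 32-bit halves hold 42, so little- and big-endian machines agree. */
    int64_t a = INT64_C(0x0000002A0000002A);
    return first_four_bytes(&a);
}
```

With a = 42 as in the samples above, a little-endian machine would give 42 and a big-endian one 0.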
Exercise left to the reader: implement a "fast" memcpy (e.g. one that copies more than 1 byte at a time for large copies, as your standard library implementation likely does) without violating the strict aliasing rules.

If you try to hit your finger with a hammer, your subsequent behavior is undefined. Please do not do that.

Where in the standard does it say your first example is undefined?

Since I don't have a copy of the C standard handy, I'll reference this answer, which covers the relevant sections of C++03, C++11, C99, and C11: http://stackoverflow.com/a/7005988/953531 . Quoting the C99 version below (§6.5 ¶7):

  An object shall have its stored value accessed only by an lvalue expression that has one of the following types 73) or 88):

  * a type compatible with the effective type of the object,
  * a qualified version of a type compatible with the effective type of the object,
  * a type that is the signed or unsigned type corresponding to the effective type of the object,
  * a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  * an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  * a character type.

  73) or 88) The intent of this list is to specify those circumstances in which an object may or may not be aliased.
Bullet 6 is what allows the second sample to have defined behavior. For the first sample, unless I'm seriously mistaken, int32_t isn't considered "a type compatible with" int64_t. Bullet 2 talks of "qualified" versions of types; I believe this refers to const/volatile-qualified types. Bullet 3 apparently allows you to type pun (unsigned int) to (signed int) or vice versa, which is an interesting bit of new trivia to me. Bullet 4 is much the same, bullet 5 requires a nonexistent union, and bullet 6 requires a character type.
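A small sketch of what bullets 3 and 6 permit, as I read them. All names here are made up for illustration. The signed/unsigned case uses a nonnegative value, since that is where the two types are guaranteed to share a representation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bullet 3: an int object read through the corresponding unsigned type. */
unsigned int view_as_unsigned(int *p)
{
    assert(p != NULL);
    unsigned int *u = (unsigned int *)p;
    return *u;
}

/* Bullet 6: any object accessed byte by byte through a character type,
   which is what makes a plain memcpy-style loop legal. */
void copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
}

/* Returns 1 if both accesses behave as described, 0 otherwise. */
int demo_bullets(void)
{
    int x = 42;
    int64_t a = 42, b = 0;
    copy_bytes(&b, &a, sizeof a);
    return view_as_unsigned(&x) == 42u && b == 42;
}
```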
Okay, good point - so it's the dereferencing step that invokes the clause you mentioned. Apparently, what I learned today is that the cast is fully legal even though it could produce an invalid pointer.

I still wonder if my snippet counts as undefined behavior, since it does dereference an "unknown" void pointer, which may have come from an incompatible object type.

BTW, the latest C1X draft is only at http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf

> I still wonder if my snippet counts as undefined behavior, since it does dereference an "unknown" void pointer, which may have come from an incompatible object type.

It counts as potentially undefined behavior - depends on what you pass in. NULL? UB. Pointer-to-uint64_t? UB. Pointer-to-uint32_t? Perfectly defined behavior! ...well, assuming we use ip[0] = 123; instead of ptr[0] = 123;, which won't compile as I've just noticed.

That said, there are some ways to construct pointers which are in and of themselves undefined behavior for merely constructing the pointer:

http://stackoverflow.com/questions/23683029/is-gccs-option-o...

More samples:

  int a[] = { 1, 2, 3 };
  int* b = a+0; // Perfectly defined/legal/normal
  int* c = a+3; // Perfectly defined/legal/normal, just don't dereference it (as it points past the end of the array)
  int* d = a+4; // Undefined behavior.  HAIL SATAN!
  int* e = a-1; // Undefined behavior.  Apparently this has also caused optimization-induced breakage in practice.  HAIL GCC!
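The one-past-the-end pointer is not just a curiosity; a sketch of the usual idiom, where it serves as a loop bound that is compared against but never dereferenced. The names sum_ints and demo_sum are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums count ints starting at first. */
int sum_ints(const int *first, size_t count)
{
    assert(first != NULL);
    const int *end = first + count; /* one past the last element: fine to form and compare */
    int total = 0;
    for (const int *p = first; p != end; ++p)
        total += *p;                /* p is always strictly before end here */
    return total;
}

int demo_sum(void)
{
    int a[] = { 1, 2, 3 };
    return sum_ints(a, sizeof a / sizeof a[0]);
}
```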
Casting pointers is well defined. What is undefined behavior is to dereference a pointer whose type does not match the pointed-to object.
That's not correct in general; casting pointers can be undefined. However, it does seem I made a mistake trying to use void * as an example.