Hacker News
by regehr 3741 days ago
Josh, what's your opinion about this situation?

https://goo.gl/3hz0em

2 comments

It's undefined. But the same thing would be undefined in C++, a language that has inheritance built-in: https://goo.gl/shOJi1

You can't downcast to a derived type if the object isn't actually an instance of the derived type. That seems straightforward, no?

It doesn't seem straightforward to me: you're using words like base and derived that aren't in the C standard.
If we talk in terms of concepts that exist in the C standard, we would say that you can't cast a pointer to pointer-to-X unless it actually points to an X.

The reason your example is illegal is that you are casting to pointer-to-"struct derived", but the thing being pointed to is not actually a "struct derived."

The "physical subtyping" pattern works because the C standard says that a pointer to a struct, suitably converted, also points to its first member. So a pointer-to-Derived, converted to a pointer-to-Base, points at Derived's first member. But a pointer-to-Base doesn't point at a Derived unless that object actually is a Derived. So the downcast is only legal if the object actually is a Derived.
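To make the first-member rule concrete, here is a minimal sketch. The struct layouts and the names `id`, `extra`, `to_base`, `to_derived_unchecked` and `round_trip` are made up for illustration; the only thing taken from the discussion is that the Base is the first member of the Derived.

```c
#include <assert.h>

/* Hypothetical types; all that matters is that Base is the FIRST member. */
struct Base {
    int id;
};

struct Derived {
    struct Base base;
    int extra;
};

/* Upcast: gives the same pointer as &d->base. */
static struct Base *to_base(struct Derived *d) {
    return (struct Base *)d;
}

/* Downcast with no check: only valid if b really is the first member
   of a Derived object. The caller has to know that. */
static struct Derived *to_derived_unchecked(struct Base *b) {
    return (struct Derived *)b;
}

/* Round trip on an object that actually is a Derived. */
static int round_trip(void) {
    struct Derived d = { { 7 }, 42 };

    struct Base *pb = to_base(&d);
    assert(pb == &d.base);          /* points at the initial member */

    struct Derived *pd = to_derived_unchecked(pb);
    assert(pd == &d);               /* and back at the enclosing struct */

    return pb->id + pd->extra;      /* 7 + 42 */
}
```

Both conversions are fine here only because `d` was declared as a `struct Derived`. Calling `to_derived_unchecked` on a standalone `struct Base` would compile just the same, which is exactly the case being argued about.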

Looks like your other comment hit the max reply depth so this will need to finish up, but in any case I don't agree with your reading of the vice versa.
It may be that the aliasing rules are also required to fully justify my conclusion (i.e. a Base can't have its stored value accessed via a pointer-to-Derived due to the aliasing rules). But I have a very high degree of confidence in the conclusion itself. I think that you will find that your compiler implements the behavior I have described.
Replying to myself because depth limit.

Let me try to think of a good way to update the post to capture this better...

There isn't actually a depth limit (or if there is, we haven't hit it yet :). Hacker News just hides the "reply" link for 5 minutes or so to cool down flamewars.

You can work around this by clicking on the link for the post itself (ie. "3 minutes ago") which allows you to reply immediately.

I don't see text that justifies your one-way argument; the bit of 6.7.2.1 that we are talking about says "and vice versa".
I'm not making a one-way argument. If the underlying object actually is a Derived, you can freely cast between pointer-to-Base and pointer-to-Derived. That is what "and vice versa" means.

But if the object isn't actually a Derived, you can't cast to pointer-to-Derived:

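    // Assumed definitions (the snippet leaves them implicit;
    // some_base_member is a placeholder name):
    typedef struct { int some_base_member; } Base;
    typedef struct { Base base; int some_derived_member; } Derived;

    Derived derived;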

    Derived *pDerived = &derived;

    // This is legal because it's equivalent to:
    //   Base *pb = &derived.base;
    //
    // ie. there actually is a Base object there that the
    // pointer is pointing to.
    Base *pBase = (Base*)pDerived;

    // This is legal because pBase points to the initial member
    // of a Derived.  So, suitably converted, it points at the
    // Derived.
    //
    // The key point is that there actually is a Derived object
    // there that we are pointing at.
    pDerived = (Derived*)pBase;

    Base base;

    // This is illegal, because this base object is not actually
    // a part of a larger Derived object, it's just a Base.
    // So we have a pDerived that doesn't actually point at a
    // Derived object -- this is illegal.
    pDerived = (Derived*)&base;

    // Imagine if the above were actually legal -- this would
    // reference unallocated memory!
    pDerived->some_derived_member = 5;
Are you sure about 'illegal' there? Is any compiler going to complain?

All the compilers I have used will cheerfully reference unallocated memory; I thought the behavior was undefined.

I'm glad I don't have to understand any of this
That is undefined behavior for two reasons. You're interpreting an object as an object that has an incompatible type.

It also violates strict aliasing, because 6.5p7 only works one way. See my other comment.

Derived can alias Base, but not vice-versa.
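For what it's worth, one common way to stay on the right side of this is to record what the object really is and check before downcasting. This is only a sketch under assumptions: the `kind` tag, the `enum Kind` values and the `as_derived` helper are hypothetical and are not part of the example under discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tag recording what the object was created as. */
enum Kind { KIND_BASE, KIND_DERIVED };

struct Base {
    enum Kind kind;
};

struct Derived {
    struct Base base;           /* first member, so the upcast is &d.base */
    int some_derived_member;
};

/* Returns a Derived pointer only when the tag says the Base is the
   initial member of a Derived; otherwise returns NULL. */
static struct Derived *as_derived(struct Base *b) {
    if (b == NULL || b->kind != KIND_DERIVED) {
        return NULL;
    }
    return (struct Derived *)b;
}

static int demo_checked_downcast(void) {
    struct Derived d = { { KIND_DERIVED }, 0 };
    struct Base b = { KIND_BASE };

    /* There really is a Derived here, so the downcast goes through. */
    struct Derived *pd = as_derived(&d.base);
    assert(pd == &d);
    pd->some_derived_member = 5;

    /* A plain Base is rejected instead of being reinterpreted. */
    assert(as_derived(&b) == NULL);

    return d.some_derived_member;
}
```

The check is only as good as the tag: it relies on every object being initialized with the correct `kind`, which the language does not enforce.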