Hacker News new | ask | show | jobs
by somenits 1086 days ago
Good article, a couple nits/additional notes:

1. The article points out you should compare structs field by field, but it doesn't explain why memcmp wouldn't work. The reason is that the padding between the fields might not necessarily be zeroed in all cases. Field by field comparison is resilient to this.

2. The article proposes this for dynamic allocation:

    struct Vector2D *vec = malloc(sizeof(struct Vector2D));
I think it's better to use the variable name inside sizeof, so like this:

    struct Vector2D *vec = malloc(sizeof(*vec));
This helps you in the case where you change the type of the variable to different kind of struct. If you change the variable name, you're probably doing a find/replace anyway, and it will almost certainly fail to compile even if you miss it.
3 comments

But if you change the type name, you are probably also doing a search and replace.

In the end, the only justification for the latter rather than the former is that the pointer name is often a short name like vec, and so sizeof *vec is shorter than the alternative.

That has advantages.

If there is a typo in a short name, you're more likely to see it.

I can easily miss the typo:

  struct Vector3D *vec = malloc(sizeof(struct Vector2D));
The sizeof <object-expr> also doesn't require parentheses, so it's shorter by two characters right off the bat.
You'll get a compiler error though unless you, in the same edit-compile cycle, created a new variable with the same name.

You also can't search-replace a type name as easily, because other variables will have the same original type but not need to be updated.

This is a fair point. If we have *foo = malloc(sizeof *foo), we can search and replace for all occurrences of foo in that scope and replace them; that's always something which potentially makes sense.

However with *foo = malloc(sizeof (type)), it doesn't always make sense to be searching and replacing all occurrences of type, even if restricted to that scope. And, when type is a built-in type like char *, we cannot* be replacing all occurrences of it, even in that same scope where foo is active.

Types are embroiled in declaring multiple entities in the program, whereas a variable name declares exactly one thing; all other occurrences of the variable are references to that thing. Sometimes it makes sense to replace some of those occurences but not others, but that's neither here nor there: in *foo = malloc(sizeof *foo) you would never replace one foo without the other, even if some references to foo elsewhere in scope remain unedited for good reasons.

Basically you'd have to do something extremely absent-minded or silly to wreck *foo = malloc(sizeof *foo), whereas accidentally wrecking the correctness of *foo = malloc(sizeof (type)) in some way can happen under only a small lapse of mindfulness. Since the two sides refer to different identifiers, it makes sense to edit them separately: if you're renaming foo, you don't touch (type); if you're changing (type) then you don't touch foo (but have to update its declaration elsewhere). There is no obvious, trivial, easy-to-maintain consistency maintain that is localized just in that assignment; things go wrong because of material in separate places elsewhere.

Then again, if we are initializing something being declared, we have:

   T *p = malloc(sizeof (T));
or

   T *p = malloc(sizeof *p)
In the former, there is internal consistency with the repetition of T.

In the latter, repetition of p.

In the former, you're not going to change one T without the other, just like in the latter you won't change one p without the other.

Assignments give advantage to the latter:

   p = malloc(sizeof (T));  // no local redundancy linking parts together.
I’ve been wondering about (1) recently—is there a way to memset the entire stack frame at the start of a function such that memcmp works as expected? Also, what are the performance implications of comparing a padded struct member-by-member vs a single big memcmp? Is member-by-member faster because you’re comparing less in total, or is memcmp faster because it’s one big contiguous compare? Or is it more complicated?

Regarding (2), sizeof(*x) doesn’t actually dereference x, right? The dereference isn’t evaluated—it’s all calculated at compile time, right?

> is there a way to memset the entire stack frame at the start of a function such that memcmp works as expected?

I'd argue no simply because the value of the padding bytes is always unspecified. A compiler that sees such a `memset` is (IMO) perfectly free to not zero known padding bytes since it knows their value should not matter to the program. Compilers might not currently do that but you can already see this kind of behavior in other situations - C compilers will happily throw out `memset` calls if it knows the result won't be used.

But also beyond that, it probably doesn't matter anyway because there's no way to _use_ the `struct` which won't leave the padding bytes with unspecified values. `memset` might reliably zero the padding bytes for you, but writing to the struct will randomly screw up the padding bytes depending on what the compiler feels is the best way to do things, so then you're back at `memcmp` no longer working. The only real way to make `memcmp` work is to ensure you have no padding bytes to begin with.

Compiles are free to not zero padding bytes if a struct is passed to memset but there are some situations where padding is not unspecified, for example if you do partial initialization of it.
Could you explain what you mean by partial initialization?

  = { .foo = bar };
(1) If the pointers you pass to memcmp are know to the compiler, then memcmp will be treated as an intrinsic, and generally the compiler generates the optimal set of instructions for a given struct.

(2) Correct. The expression is only evaluated at compile time.

If you really wanted to memcmp a padded struct, you could declare it as packed and define the padding yourself.
You can have memset/memcmp work for individual structs by setting the padding yourself, and to make sure you do it right, on GCC/clang you can use -Wpadded. It's probably best to do that on a case-by-case basis, unless you're prepared to deal with a lot of compiler warnings/errors though.

    #pragma GCC diagnostic push
    #pragma GCC diagnostic error "-Wpadded"
    struct foo {
      uint8_t a;
      uint8_t _pad[3];
      uint32_t b;
    };
    #pragma GCC diagnostic pop