by kzrdude 1137 days ago
You'll have to understand the 'x' syntax and the "xyz" syntax as two different things. The different quotes mean different kinds of literal.

I know. But my understanding was that `"xyz"` is an array of characters so that these two would have the same representation in memory:

  char word[] = {'x', 'y', 'z', '\0'};  // sizeof(word) = 4, sizeof(*word) = 1
  char word[] = "xyz";                  // sizeof(word) = 4, sizeof(*word) = 1
What I did not realize was that the above two are not the same as this:

  char *word = "xyz";  // sizeof(word) = 8 on a typical 64-bit platform, sizeof(*word) = 1
The representation of an object is determined by how the object itself is defined.

An initializer doesn't change that. It only affects the value stored in the object when it's created.

One special case is that an array object defined with empty square brackets gets its length from the initializer, so

    char word[] = "xyz";
is a shorthand for, and is exactly equivalent to:

    char word[4] = "xyz";
What you seem to be highlighting there is the difference between applying sizeof to an array and applying it to a pointer. That difference is real, even though an array decays to a pointer in most other contexts.
Right, I am mixing up two things here. You are right that bringing up pointers here is a mistake.

But apart from that, I would expect `{'x', 'y', 'z', '\0'}` to have size 16 rather than size 4 because it consists of four character literals which each have size 4 on my machine.

Maybe do not overthink it. 'x' is called a character literal, but it has the type int.

`{'x', 'y', 'z', '\0'}` does not have a type by itself, but it's valid syntax to use it to initialize various structs and arrays - some of those will have the size you are looking for, depending on which type of array or struct you choose to initialize with that: https://gcc.godbolt.org/z/Tqjq3xzKo

Thank you for the explanation and the Godbolt example! I appreciate it. Apologies for fumbling around in confusion.
sizeof() returns the number of "units" that something -- an expression or a type -- takes up. What do you think those units are?

They are literally defined as "characters". sizeof(char) is always 1.

Your confusion (besides the pointer thing) is that 'x' is a funny way to write an int, not a char.

It seems to me that `sizeof` returns the number of bytes that the thing takes up in memory. For example:

  int numbers[] = {1, 2, 3};  // sizeof(numbers) = 12 when int is 4 bytes
> Your confusion (besides the pointer thing) is that 'x' is a funny way to write an int, not a char.

Yes, this might be it. So the way to get a `char` value that contains 'c' is to use a cast and write it as `(char) 'c'`. This changes the representation in memory so that it now takes up only one byte rather than four, right?

`(char)'c'` is an expression of type char.

Its size is one byte -- but the size of an expression isn't really relevant, since it's (conceptually) not stored in memory.

You can assign the value of an expression to an object, and that object's size depends on its declared type, not on the value assigned to it. The cast is very probably not necessary.

    char c1 = 'c'; // The object is one byte; 'c' is converted from int to char
    int  c2 = 'c'; // The object is typically 4 bytes (sizeof (int))
The fact that character constants are of type int is admittedly confusing -- but given the number of contexts in which implicit conversions are applied, it rarely matters. If you assign the value 'c' to an object of type char, there is conceptually an implicit conversion from int to char, but the generated code is likely to just use a 1-byte move operation.