by krylon 4040 days ago
IIRC, if the argument to sizeof is a type, the parentheses are mandatory, anyway, so using it like it was a function is more consistent.

I think the reason it is not a function from the standard's point of view is that C does not have any builtin functions (unless my memory totally fails me in this case), all functions have to be either defined locally or #included.
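For what it's worth, here is a minimal sketch of the two spellings being discussed. The function names are made up for illustration; only the sizeof syntax itself comes from the language.

```c
#include <stddef.h>

/* sizeof applied to a type name: the grammar requires the parentheses. */
size_t size_of_int_type(void)
{
    return sizeof(int);
}

/* sizeof applied to an expression: the parentheses are optional,
   because sizeof is a unary operator here, much like - or !. */
size_t size_of_int_expr(void)
{
    int x = 0;
    return sizeof x;   /* same result as sizeof(x) */
}
```

Both functions return the same value, so always writing the parentheses costs nothing.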


Exactly.

I heard people arguing on the Internet that you should not add parentheses when sizeof is applied to an expression, only a type. I just cannot understand why they bother. Just add parentheses and it is always right. Much less cognitive burden.

These are the same people who find it "better" to write JS without the semicolons.
The problem is that without a linter there's no punishment for missing semicolons.

So really the argument should be that everyone should use a linter.

Agreed. I spend most of my time writing non-semicolon languages like Ruby, Haskell, Scala, Elixir, Clojure, and Python. As a result, I never developed the semi-colon reflex that a lot of Algol-family language users built up. I constantly forget semi-colons even after all these years, and my own inability to remember annoys me.

I used to skip semicolons in JavaScript because I could never remember to include them. Using a linter with Emacs made it a non-issue; I get a warning in my editor immediately when I miss one.

PS: I'm really enjoying Rust, so maybe this will be the language that finally forces me to pick up the semi-colon habit.

I thought about that regarding semicolons, optional braces, newline placement, consistent placing of whitespace in general; then I decided having the machine tell me where I need to do more work is all very well and good, but what I really want is a program that will do the work for me instead. So I wrote one: https://www.npmjs.com/package/jsclean
Oh, there's plenty of punishment. Avoiding the punishment is the difficult part. And you can do that the easy way or the hard way.
Well, sure, but it's probabilistic and delayed punishment. This makes learning habits and ensuring them difficult.
I don't believe it's equivalent. When I'm writing Scala code in the backend and have to switch to frontend I often find myself omitting the semi-colons, then going back and fixing them due to self-imposed coding conventions. For JavaScript, just as in Python or Scala, semi-colons are optional for a good reason.
We would all be better off if semicolons had just been required in JavaScript. The problem is that semicolons are not actually optional in JavaScript. Instead JavaScript uses ASI, automatic semicolon insertion, where the parser attempts to determine where semicolons should go. Unfortunately the rules are complex, prone to certain errors, and I can't rely on everyone I ever work with understanding all of those rules. For all of these reasons, if you are writing JavaScript you should just use semicolons.
>Unfortunately the rules are complex, prone to certain errors, and I can't rely on everyone I ever work with understanding all of those rules.

Are they really complex? This recently released version 4.0.0 of the JavaScript Standard Style [1] suggests to never start a line with "(" or "[". This rule looks even simpler than the rules of operator precedence.

[1] https://github.com/feross/standard

Update: Now I learned about the JavaScript Semi-Standard Style [2] which actually enforces semi-colons. Quite hilarious.

[2] https://github.com/Flet/semistandard

They are complicated enough; and different enough from other languages that they lead to unexpected behaviors. It is just easier to say, "everyone must use semi-colons". Then I can add linting into our build process and reduce risk of bugs.

One quick example is the very contrived example that follows.

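    return
    {a: 1}

That is valid JavaScript. ASI inserts a semicolon after the return, so it is evaluated as an empty return followed by an unreachable statement (the braces are parsed as a block containing the labeled statement a: 1, not as an object literal). Probably not what was intended.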

I've sent copies of the rules to people and explained them numerous times, but at the end of the day it is more productive to just use them and move on to something else IMHO.

If you have an argument to make, then make it.

The previous poster didn't say he was unwilling to learn JS semicolon rules, he was saying he couldn't trust everyone he ever works with to learn them. Replying with an accusation of FUD and a link to someone ranting that it's unprofessional for JS devs to not know these rules is unnecessarily inflammatory while missing his point.

Well, semicolons may be optional in Python, but the official Python style strongly recommends against (needing to use) them.

It's not about forgetting them once in a while, it's about playing a game of "needs a semicolon or doesn't" which increases the (already high) number of things the developer needs to worry about

In Python you usually don't add semi-colons, and it will bark when your intention is unclear.

In JavaScript you may omit semicolons, but the interpreter will guess your intention when it is ambiguous, and it will probably guess wrong.

Do semicolons make the code run faster? What is the advantage exactly? There is no downside to leaving them out.
AFAIK the only "benefit" to semicolons is that you can put multiple statements on the same line.
In C there are no functions that take a type as argument.

So

   sizeof(type)
is a special case anyway.

I'm all for using sizeof like a function, but that doesn't make it consistent. sizeof is just a special syntactical construct.

I like to think that sizeof is called an operator just for syntactic convenience much in the same way as typedef is a storage-class specifier.

And you really shouldn't use it with a type if you can avoid it anyway, it makes code brittle e.g.

    int *foo;
    // code
    foo = malloc(sizeof(int));
A few months later, change foo to be a double *. Code still compiles, no warning, but you're allocating (typically) half the memory you need.
This is a really great example why you SHOULDN'T think of sizeof as a function. If sizeof were a function, the code

  int *foo = NULL;
  foo = malloc(sizeof(*foo));
would be undefined behavior (dereferencing NULL)!
Dereferencing a null pointer is legal in C. It's the conversion of a null pointer from an r-value to an l-value that's illegal, which does not happen in that snippet of code.

That's why it's perfectly legal in C to do this (&*foo), even if foo is a null pointer.

> Dereferencing a null pointer is legal in C. It's the conversion of a null pointer from an r-value to an l-value that's illegal, which does not happen in that snippet of code.

And it does not happen in that snippet of code because sizeof is nothing like a function.
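To make the "nothing like a function" point concrete, here is a small sketch (function names are hypothetical). Outside of variable-length arrays, the operand of sizeof is not evaluated at all; only its type is inspected.

```c
#include <stddef.h>

/* Returns i after it has appeared inside a sizeof operand.
   The operand is not evaluated, so the increment never happens. */
int counter_after_sizeof(void)
{
    int i = 0;
    size_t n = sizeof(i++);   /* some compilers warn about the unevaluated side effect */
    (void)n;
    return i;
}

/* Only the type of *p is inspected; the null pointer is never dereferenced. */
size_t pointee_size_of_null(void)
{
    double *p = NULL;
    return sizeof(*p);
}
```

If sizeof were an ordinary function, its argument would have to be evaluated before the call, and both of these would behave very differently.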

Which is why it's nice to lift stuff out into typedefs. It centralizes them (DRY principle) and avoids this issue.

    typedef int thing_t;
    ...
    thing_t *foo;
    // code
    foo = malloc(sizeof(thing_t));
Why is that better than just getting `sizeof(* foo)`? Even with typedef, I can see this happening in the future:

    typedef int thing_t;
    ...
    thing_t *foo_internal;
    thing_wrapper_t *foo;
    // code
    foo = malloc(sizeof(thing_t));
If you do:

    foo = malloc(sizeof(*foo));
That's at least always on the same line.
That just makes it more verbose.

This, on the other hand, always allocates one object of foo's pointed-to-size, whatever its type:

    foo = malloc(sizeof(*foo));
As an aside, I think nearly any time you want a typedef, it's worth wrapping it in a struct.

    typedef struct { int value; } thing_t;
That way the compiler catches it when you try to pass the wrong thing (at least, more of the time).
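A rough sketch of what that buys you, with made-up type names (user_id_t and order_id_t are purely illustrative):

```c
/* Both wrappers hold an int, but they are distinct types, so passing
   one where the other is expected is rejected at compile time. */
typedef struct { int value; } user_id_t;
typedef struct { int value; } order_id_t;

int user_id_value(user_id_t id)
{
    return id.value;
}

/* user_id_value((order_id_t){ 7 });
   ^ does not compile: incompatible type for the argument.
   With plain "typedef int user_id_t; typedef int order_id_t;"
   the same call would compile silently. */
```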
DRY is good, but making the structure of your code reflect the actual semantics you want is better. What you want is to allocate space for foo. So write that.
Well, in this (simple) case you can just do

    int *foo;
    foo = malloc(sizeof(*foo));
And avoid the brittleness mentioned.
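Spelled out a bit more, as a sketch only: elem_t and alloc_elems are hypothetical names, and overflow of n * sizeof(*buf) is ignored for brevity.

```c
#include <stdlib.h>

/* Hypothetical element type; imagine it was int a few months ago. */
typedef double elem_t;

/* Allocates room for n elements. sizeof(*buf) follows buf's declared
   type, so changing elem_t requires no edit on the malloc line. */
elem_t *alloc_elems(size_t n)
{
    elem_t *buf = malloc(n * sizeof(*buf));
    return buf;   /* NULL if the allocation failed */
}

/* The size the allocation uses per element, for inspection. */
size_t elem_size_from_pointer(void)
{
    elem_t *p = NULL;
    return sizeof(*p);
}
```

Had the malloc line said sizeof(int), it would have kept compiling after the typedef changed and quietly under-allocated.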
There are standard macros (va_arg, I'm looking at you) that take a type as an argument.
Another oddity of C that amuses me is the do/while loop without braces:

    int i = 4;
    do
       printf("hey\n");
    while (--i > 0);
Even though do/while is a keyword bracketing pair in C, it still only lets you use a single statement (because nested whiles). So everybody uses braces, and thus it looks quite disturbing without them.
It's not so disturbing when you consider that braces aren't special cased in the C grammar. They group statements so they can be used together where a statement is needed. From the grammar's perspective the "normal" way is without braces. It's just that all the C-alikes have gone a different direction with the way braces are parsed, leading everyone to regard the original behavior in C as an ugly wart.
for, if, and while without curly braces for single statement blocks are easily one of the worst things about C. It bites you in the ass every time. The balance between its utility and its capacity to cause bugs is so one sided, I don't understand why it's even taught to beginners. If you are teaching a new programmer that saving keystrokes is important, you're on the fast track to creating a shitty programmer.
That's pretty severe. Written with spacing, it's really pretty clear what is meant by

   if (condition)
      Foo(x)
Braces are, in my opinion, an unfortunate necessity in some cases. They are a much larger cause of error than NOT using them ever could be.

An ideal IDE would make blocking visible (background tone change etc), and braces could be emitted automatically by the IDE without ever cluttering up the code shown to the programmer.

But tying everyone to a single IDE is never going to happen. The language is out there. It's going to stay backward compatible forever.

How could the presence of braces cause a worse error than no braces? Code compiles when you completely omit braces but put more than one statement underneath. The error is logical, not syntactical.

But if you have an open brace without a matching close brace, that's a compile error. What other error are you referring to?

How can you know that you won't add more statements to the block? Why have a special case at all? For me it's just become muscle memory to add the braces. It's a risk with exactly zero upside to omit braces.

The only problem with this is if/when someone comes along and doesn't notice the missing braces and does

   if (condition)
      Foo(x)
      Bar(x)
Expecting Bar(x) to be part of the conditional. This can and does happen.
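Here is a small sketch of that failure mode. Foo and Bar are stand-ins that just count their calls; the counters are an assumption added so the behavior can be observed.

```c
/* Call counters standing in for whatever Foo and Bar really do. */
int foo_calls = 0;
int bar_calls = 0;

void Foo(int x) { (void)x; foo_calls++; }
void Bar(int x) { (void)x; bar_calls++; }

/* Indentation suggests both calls are conditional; the grammar disagrees. */
void run_unbraced(int condition, int x)
{
    if (condition)
        Foo(x);
        Bar(x);   /* always runs: only Foo(x) belongs to the if */
}

/* With braces, both calls really are guarded by the condition. */
void run_braced(int condition, int x)
{
    if (condition) {
        Foo(x);
        Bar(x);
    }
}
```

Calling run_unbraced(0, 1) still increments bar_calls, while run_braced(0, 1) touches neither counter. Recent GCC and Clang can flag the first version with a misleading-indentation warning, but only if that warning is enabled.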
I hear folks say that, but don't encounter it in the wild. Maybe once in 20 years so far. That construct, reading it just now, just screams out at me "Indentation error!"

Anyway it nicely illustrates the need to get braces out of there altogether. The programmers' intent is obvious; let the IDE 'make it so' by emitting braces in the generated code.

Non-bracket style is useful and informative in some cases when used properly (your local style guide takes precedence of course). I would never write

   if (condition)
      Foo(x)
for the reasons you said. But it can be very useful for a block of "single liners":

  /* clean up input before passing it to flaky_external_module() */

  for (; isspace(*p); ++p);   /* skip leading whitespace */
  if (!isdigit(*p)) return INPUT_ERR;
  for (char *i = p; *i; i++) *i = toupper(*i);
  ...

  flaky_external_module(p);

Basically a small block (that pretty much fits in your fovea) that does a bunch of minor tasks. Spacing them out would actually confuse the code.
I contend that the brackets always make the code more readable, without exception. It's the normal case. All other variations make you think "something special is happening here, I'm going to have to parse this carefully". The for with a semicolon at the end is easy to overlook. The return in the middle of a function in the middle of a line is easy to overlook.

Spacing code out makes it more readable and maintainable, not less. It brings consistency, and it is more prepared for the inevitable change. I think maintenance is the driver of all code style. When you make tight one liners or forgo braces, or use the ? and : operators instead of if and else, you're not really saving time, you are deferring work, in a lot of cases to another programmer.

> Another oddity of C that amuses me is the do/while loop without braces: [...] > Even though do/while is a keyword bracketing pair in C, it still only lets you use a single statement (because nested whiles).

Ah, but don't forget you can still use the comma operator to get several expressions in before the semicolon:

    int i = 4;
    do
       printf("hey"), printf("Jude.\n");
    while (--i > 0);
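Strictly speaking that is still one statement: the comma operator joins two expressions into a single expression statement. A quick sketch with counters instead of printf (the function name is made up):

```c
/* Runs a brace-less do/while whose body is one expression statement
   built with the comma operator, and reports how often each half ran. */
int comma_body_runs(int iterations)
{
    int first = 0, second = 0;
    int i = iterations;

    do
        first++, second++;
    while (--i > 0);

    return first + second;
}
```

With iterations set to 4 the body runs four times, so both counters reach 4 and the function returns 8.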
> it looks quite disturbing without them

Remove the newline and it looks ok to me:

    int i = 42;
    do printf("hey\n");
    while (--i > 0);
Another possibility would be:

    int i = 42;
    do printf("hey\n");
        while (--i > 0);
Oohh... this would be so good for the underhanded C contest! Just mix it up with comments that make the `do` look like (assuming a good actor) an accidentally wrapped comment.

    /* The following code does something, so here's the
    // explanations of what happens. And here's what we actually
    */ do
    printf("hey\n");
    
    /* And now just count down */
    while (some_check(--i));
Now spot that in a large file of real code!
Sneaky. Most syntax highlighting would make it stick out like a sore thumb, but still, sneaky.
Just looking ok is not the point. I would prefer a match between how it looks and what it does...
I agree, but I don't see that undermined here.
> it is not a function from the standard's point of view is that C does not have any builtin functions

The problem with sizeof is that it should be able to accept a type as an argument. No function in C can do that, according to the standard C grammar.

Somewhat similarly, the standard va_arg is a macro, not a function, also because it accepts a type.
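For anyone who hasn't used it, a minimal sketch of va_arg taking a type name as its second argument (sum_ints is a hypothetical helper, not anything from the standard library):

```c
#include <stdarg.h>

/* Sums `count` int arguments. The second argument to va_arg is a type
   name, which no ordinary function parameter could be. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);
    }
    va_end(args);

    return total;
}
```

For example, sum_ints(3, 1, 2, 3) returns 6.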