by commandlinefan 2119 days ago
> Relying on the difference between the pre and post variations of these operators is a classic area of C programmer ego showmanship.

Ugh. He's talking about inline use of post-increment and pre-increment (i.e. x++ and ++x) here. This is perfectly readable to a C programmer, and sidestepping them actually makes the code harder to understand.

Can you give an example where not using them 'inline' makes code harder to understand?
Consider the idiomatic way of iterating backward through an array:

  for(i=n; i-- > 0 ;)
    { /* operate on a[i] */ }
Converting the i-- into a statement at the start of the block makes it less clear that it's part of the iteration idiom rather than an ad hoc adjustment that's specific to this particular logic. There are other examples, but they're either more involved or statementification is less obviously wrong.
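To make the comparison concrete, here is a minimal sketch of both forms. The function names and the out buffer are made up for illustration; only the loop header comes from the comment above.

```c
#include <stddef.h>

/* Idiom form: the decrement lives in the loop condition.
   The test `i > 0` uses the old value; the body then sees i - 1. */
void reverse_copy_idiom(const int *a, size_t n, int *out)
{
    size_t i, k = 0;
    for (i = n; i-- > 0; )
        out[k++] = a[i];
}

/* Statement form: same behaviour, but the decrement now sits inside
   the body, next to the logic that operates on a[i]. */
void reverse_copy_stmt(const int *a, size_t n, int *out)
{
    size_t i = n, k = 0;
    while (i > 0) {
        i--;
        out[k++] = a[i];
    }
}
```

Both versions are safe for n == 0 and for unsigned index types, since the body never runs with a wrapped index.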
Hmm, I think `for (i = n - 1; i >= 0; --i)` is way clearer and maybe more common?

edit: Ah unsigned underflow. :O

Yeah, so then you write

    for (size_t i = n-1; i < n; --i) { /* operate on a[i] */ }
It works fine (unsigned overflow is well defined) but it's even less clear.
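A small sketch of why that loop terminates, assuming a hypothetical find_last helper that is not part of the thread. It relies on unsigned arithmetic wrapping modulo SIZE_MAX + 1, which the C standard guarantees.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the last element equal to key, or n if absent. */
size_t find_last(const int *a, size_t n, int key)
{
    /* When i is 0, --i wraps to SIZE_MAX, so `i < n` becomes false.
       When n is 0, n - 1 is already SIZE_MAX and the body never runs. */
    for (size_t i = n - 1; i < n; --i)
        if (a[i] == key)
            return i;
    return n;
}
```

The termination test `i < n` only makes sense once you know about the wraparound, which is arguably what makes this form less clear than the `i-- > 0` idiom.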
It seems sensible to always just use signed values for indices. Indices are difference types, which should include negative values so that you can subtract two indices and get a sane delta. The range of signed values seems 'big enough.'
> Indices are difference types

Umm, no? Indices are ordinals[0], forming the canonical/nominal well-ordering of a collection such as an array.

> an ordinal number, or ordinal, is one generalization of the concept of a natural number that is used to describe a way to arrange a (possibly infinite) collection of objects in order, one after another. [...] Ordinal numbers are thus the "labels" needed to arrange collections of objects in order.

0: https://en.wikipedia.org/wiki/Ordinal_number

The C language "de facto" uses size_t for indexing and ptrdiff_t for differences, or the rare case where you have a negative index.
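As a rough illustration of that split, here is a sketch with a hypothetical index_delta helper. It assumes both indices fit in ptrdiff_t, which holds for indices into any ordinary array.

```c
#include <stddef.h>

/* Signed distance from index `from` to index `to`.
   Assumption: both values are representable as ptrdiff_t. */
ptrdiff_t index_delta(size_t from, size_t to)
{
    return (ptrdiff_t)to - (ptrdiff_t)from;
}
```

Subtracting the size_t values directly would instead wrap around to a large positive number whenever `to` is smaller than `from`.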
size_t is unsigned? Since when?
It always has been. C89, 4.1.5[1]:

> The types are [...] size_t which is the unsigned integral type of the result of the sizeof operator

(Emphasis mine.)

1. https://port70.net/~nsz/c/c89/c89-draft.html#4.1.5

Couldn't find an online version of the C standard with links to parts of it, but here's one for C++: http://eel.is/c++draft/support.types#layout-3

> The type size_t is an implementation-defined unsigned integer type that is large enough to contain the size in bytes of any object ([expr.sizeof]).

Yes. ssize_t is signed.
That's the idiomatic way? Cool. The more straightforward-looking way,

    for(i = n-1; i >= 0; i--)
        { /* operate on a[i] */ }
breaks if i is unsigned, like a size_t.
Yep. That's why it's an idiom, rather than an obvious-way-of-doing-it-that-anyone-competent-would-use.
You can't beat

    return x++;
:)
Why? This should be straightforward.
Imagine operating on something like a stack.

  x = *stack--; // pop 'x' off of the stack
  *++stack = y; // push 'y' onto the stack
This way is simple, direct, and it avoids inconsistent state.
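For anyone who wants to try it, here is a self-contained sketch of the two one-liners. The storage array, its size, and the convention that stack points at the current top element (with storage[0] as an unused slot) are assumptions for illustration, and there are no overflow or underflow checks.

```c
#define STACK_MAX 16

/* storage[0] is never written; `stack` always points at the top element. */
static int storage[STACK_MAX + 1];
static int *stack = storage;

/* Push: advance the pointer first, then store. */
void push(int y)
{
    *++stack = y;
}

/* Pop: read the top element, then step the pointer back. */
int pop(void)
{
    return *stack--;
}
```

After every complete statement, stack points at a valid top element, so there is no intermediate line at which the pointer and the contents disagree.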
I don't see how this proves the point.

For someone who doesn't have the operator precedence rules memorized, it isn't clear whether the above code means this:

    x = *stack;
    stack--;
or this:

    stack--;
    x = *stack;
Combining those two operations into one line is a trade-off I will never agree with. And I'm a fan of C myself: https://gist.github.com/cellularmitosis/3327379b151445c602ad... https://gist.github.com/cellularmitosis/d8d4034c82b0ef817913...

The two-liner is actually the one which is simpler and more direct, as it requires less knowledge of operator precedence rules. The one-liner and two-liner compile to the same number of instructions, so I don't see how either "avoids inconsistent state".

Many expert-level C programmers tend towards one-liners. Here's an example from the original "Red book":

    c = ((((i&0x8)==0)^((j&0x8))==))*255;
nooooo don't do it sadpanda.jpg
> The one-liner and two-liner compile to the same number of instructions, so I don't see how either "avoids inconsistent state".

It's not about performance, or thread safety, or anything like that; it's about having a coherent mental model of the code. A statement should, if possible, represent a single, complete operation. Invariants should not be violated by a statement, with respect to its environment. (This is more true for 'push' than 'pop'.) One way of solving that is to bundle the 'push' and 'pop' operations up into functions; someone else in this thread did that. But why bother with the mental overhead of a function call when you could just represent the operation directly? To be sure, there are cases where the abstraction is warranted, but a two- or three-line stack operation isn't abstraction, it's just indirection.

> For someone who doesn't have the operator precedence rules memorized, it isn't clear whether the above code means [snipped] or [snipped]

> The two-liner [...] requires less knowledge of operator precedence rules

It's not operator precedence—that's a separate issue; despite having implemented c operator precedence, I don't know all of them by heart—but simply behaviour of pre- and post-increment/decrement operations. It's even mnemonic—when the increment symbol goes before the thing being incremented, the increment happens first; else after—but even if not, it's a fairly basic language feature.

Even beyond that, though, it's an idiom. Code is not written in a vacuum. Patterns of pre- and post-increment fall into common use over time and become part of an established lexicon which is not specified anywhere. Natural language works the same way. Nothing wrong with that.

> It's not operator precedence—that's a separate issue

> It's even mnemonic—when the increment symbol goes before the thing being incremented, the increment happens first; else after—but even if not, it's a fairly basic language feature.

I think you missed the issue.

This is 100% about operator precedence, and has nothing to do with the decrement operator being in front of or behind the variable.

This expression:

    *stack--
means either this:

    (*stack)--
or this:

    *(stack--)
depending on the operator precedence rules.

If this is the layout of memory:

             ~~~~~~
    stack-1: | 52 |
    stack:   | 23 |
    stack+1: | 19 |
             ~~~~~~
(* stack)-- evaluates to 22, while *(stack--) evaluates to 52.

https://godbolt.org/z/P7Ghfc

> operator precedence

Right, yes. I got confused by your example, because the example is definitely about pre- vs post-increment. My point about idioms still stands, though.

> (* stack)-- evaluates to 22, while * (stack--) evaluates to 52.

Actually, (* stack)-- evaluates to 23, but changes *stack to 22 :)
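To pin down the two readings, here is a sketch that uses the memory layout from the comment above (52, 23, 19). The struct and function names are invented for illustration. In C, postfix operators bind tighter than unary `*`, so `*stack--` parses as `*(stack--)`.

```c
#include <stddef.h>

/* What the expression evaluated to, how far `stack` moved, and what the
   cell it originally pointed at holds afterwards. */
struct outcome {
    int value;
    ptrdiff_t moved;
    int cell;
};

/* (*stack)-- : post-decrement the int that stack points at. */
struct outcome decrement_pointee(void)
{
    int mem[3] = {52, 23, 19};      /* stack-1, stack, stack+1 */
    int *stack = &mem[1];
    struct outcome r;
    r.value = (*stack)--;
    r.moved = stack - &mem[1];
    r.cell = mem[1];
    return r;
}

/* *stack-- : parsed as *(stack--), post-decrement the pointer itself. */
struct outcome decrement_pointer(void)
{
    int mem[3] = {52, 23, 19};
    int *stack = &mem[1];
    struct outcome r;
    r.value = *stack--;
    r.moved = stack - &mem[1];
    r.cell = mem[1];
    return r;
}
```

Both expressions evaluate to 23, because a post-decrement yields the old value. They differ in the side effect: the first leaves the pointer alone and changes the cell to 22, while the second leaves the cell alone and moves the pointer down to the 52.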

Saving characters on spacing is a terrible thing to do. In fact that jumble is missing a zero on the equality, which is made less evident because the characters are not spaced in a way that makes this mistake obvious.

    int pop_int(void)
    {
      int x = *stack;
      --stack;
      return x;
    }

    void push_int(int x)
    {
      ++stack;
      *stack = x;
    }

Genuine questions:

- Is this worse?
- How does the state get inconsistent?

For one, it's three lines and two lines for what are two logical operations. I assume the "inconsistent state" is the time between the lines where the stack is not truly in the right state; many people prefer to preserve their invariants as much as possible.
It will produce indistinguishable assembly language, no?
The use of that construct is mainly a stylistic choice. On any compiler from this millennium there should be no difference in the code that it produces.