Hacker News new | ask | show | jobs
by bluetomcat 1392 days ago
A disadvantage comes to mind. While the following loop works as expected:

    for (size_t i = 0; i < length; i++) ...
The following causes an unsigned integer underflow and is an infinite loop:

    for (size_t i = length - 1; i >= 0; i--) ...
8 comments

As Jens Gustedt points out[1], the following intentional unsigned overflow works perfectly for downwards iteration (even when length is 0 or SIZE_MAX), though it looks a bit confusing at first:

  for (size_t i = length - 1; i < length; i--) ...
You are also free to start at any other (not necessarily in-bounds) index, just like with ascending iteration.

[1] https://gustedt.wordpress.com/2013/07/15/a-praise-of-size_t-...

Principle of least surprise violated.

Also, that behavior is not guaranteed. The programmer would need to be aware of how the particular machine in question actually handles that.

Then again, that's C.

Unsigned integer underflow and overflow are both guaranteed to wrap by the C standard.
Uh I'm confused but don't know c++.

why doesn't that loop end instantly?

I mean length - 1 < length should always be true, right?

Or does it only terminate when the number underflows? Terribly confused here

For loops are translatable from:

  for(initialize; condition; increment) { ... }
to:

  initialize;
  while(condition) {
    ...
    increment
  }
(more or less, some scoping things not encompassed by the above; this is also how pretty much every for loop in a C-syntax language works) The condition of a for loop is equivalent to a while loop's condition. So yes, length - 1 < length will be true on the first iteration, which is fine because the loop continues as long as that condition is true.

What the above approach takes advantage of is that when underflow eventually happens you'll have this condition:

  MAXINT < length
Which will terminate it for all possible values of length.
It‘s an unsigned int, so past 0 it overflows back to the maximum
Ooh, i see. I wasn't aware that they under/overflow at 0 when they're unsigned. Thanks for broadening my horizon!
It loops while the condition is true. When an underflow happens, it stops being true.
It’s a condition to run, not a condition to stop
That was my reaction - why should anyone think it might be the latter? Are there languages that do have such a syntax without explicit keywords ("do...until")?
I think lisp or scheme does. I was often confused by that when I was playing with it
The loop continues until i transitions from 0 to 0 minus 1. 0-1 in this case actually doesn't equal -1 since size_t is an unsigned type, instead it wraps around to be the largest possible positive integer instead. TLDR; yes as you speculate it terminates when the number underflows.
the footnote [1] should be [0], just for the sake of this very topic.

Seriously though, while the idiom does work for unsigned integers, it's a bad idiom to learn [makes code reviews harder]. The post-decrement one in the loop body works with everything (signed/unsigned), and it's well known.

In that specific case I'd do the following:

    for (size_t i = n; i-- > 0 ;) ...
Or count from `length` to 1, but subtract 1 in the loop body, or count up and subtract the length in the loop body. Any modern compiler should be able to optimise these to be equivalent.

In the majority of cases, counting down is not necessarily. Nor is ordered iteration. Most languages have a `for each` style syntax that's preferable anyway.

Ah yes, the goes-to operator -->
And the wink operator, so the compiler knows you know the deal
I like how GP is called "operator-name" and instead of doing himself, he makes others joking with operator names. Although I'm not sure if it's altruism or highly manipulative behaviour.
Gold - have an upvote!
Or alternatively:

  size_t i = length;
  while (i--) ...
Unless I am remembering C wrong, this would work but the

   for (size_t i = length; i-- > 0 ; ) ...
that several other people posted would not execute for index 0. Shouldn't it be this instead?

   for (size_t i = length; --i > 0 ; ) ...
the correct way/idiom to reverse iterate an array is

  for (size_t i = length; i-- > 0; )...
It's surprising how often the issue pops, it works well with both signed and unsigned integers.

(edit) I've started with one based indexing (basic)... mixed with 0 based (assembly), more 1 based (pascal), then more stuff (all zero based). I am, yet, to see a real advantage of a one based indexing... after the initial process.

Is this cache friendly?
Sure, why wouldn't it be? As far as a cache is concerned, I don't think reverse sequential iteration would be any different than forward sequential. The actual RAM accesses may be less optimal if there's some speculative pre-fetching with assumed forward sequential access, but that's conjecture.
With some exceptions, hardware prefetch works in terms of ascending accesses. To learn if a particular CPU will prefetch for descending access, benchmarking is essential. Best to use soft prefetch calls if performance is critical.
i would suspect that the cache prefetch/prediction could use the "velocity" of the memory access to predict the next access; so if the access pattern was going backwards, the "velocity" would be negative, but prefetching would still work if they just followed the predicted pattern.
It is not, unless your compiler is smart enough to recognize reverse iteration and prefetch appropriately.

If performance matters, you should experiment with __builtin_prefetch, which is available in clang and GCC.

It’s not. It was nice on architectures were cache didn’t matter much and were subtracting and comparing to zero was just one instruction (looking at you old core ARM)
In the C programming language unsigned integers do not overflow. They wrap. This is well-defined behaviour and the example code is simply incorrect. Most modern compilers will give you a diagnostic for this.
If you have foolishly turned off -W in your build system, that could happen. Otherwise, you get a nice warning pointing out your folly.
Unless wrapping underflow is sensible for the domain (which it isn’t when representing the size of something), unsigned integers are usually a bad idea.
You can always rip a page out of C++’s playbook:

    for (size_t i = length; i > 0; i--) {
        // ...
        item = array[i - 1];
(This is how reverse iterators work in C++.)

  for (size_t i = 0; i < length; i++) {
      size_t j = (length - 1) - i;
      ...
  }
EDIT: change i to j
That's quite broken (you'd want a different variable inside the body, vs clobbering the iteration counter, else this would process the last item in your list, then exit).
I hope you haven't done this anywhere.

Makes my brain hurt, but I think this will only run through the loop one time looking at the last element of the array.

Uh no don't reassign the loop variable in the inner scope. Use:

    const j = (length - 1) - i;
in that case. Much safer.
You should save the old value of i somewhere and restore it back at the very end of the loop. Or simply define a new j like another comment says.