Hacker News
by astrange 1713 days ago
> If you want to blame someone, blame the designers of the C language for doing things like making int the natural idiom to iterate over arrays even when size_t would be better. The fact that C programmers continue to write "for (int i = 0; i < n; i++)" to iterate over an array is why signed overflow is undefined, and it is absolutely a critical optimization in practice.

Well, size_t is unsigned and has defined overflow (it wraps), so you'd lose the optimization if you switched to it. (Specifically, there are cases where defined overflow means a loop is possibly infinite, which blocks all kinds of optimizations.)
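A minimal sketch of the "possibly infinite" case, assuming a loop bound that is allowed to equal the type's maximum. The function name and the iteration cap are hypothetical; the cap exists only so the demonstration terminates.

```c
#include <limits.h>

/* Runs `for (i = 0; i <= n; i++)` with an unsigned counter and returns the
   number of iterations, giving up after `cap` iterations. `cap` is a
   hypothetical safety limit so the demo always terminates. */
unsigned long count_iterations_le(unsigned n, unsigned long cap)
{
    unsigned long iters = 0;
    for (unsigned i = 0; i <= n; i++) {
        /* When n == UINT_MAX, `i <= n` is always true: i would wrap to 0
           rather than ever exceed n, so only the cap ends the loop. */
        if (++iters >= cap)
            break;
    }
    return iters;
}
```

With a signed counter, the compiler may assume the increment never overflows, so it can treat the same loop shape as running exactly n + 1 times.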

Many languages try to fix this by defaulting to wrap on overflow, but that was a mistake because you rarely actually want that. A better solution is to have a loop iteration statement that doesn't have an explicit "int i" or "i++" written out.

2 comments

The optimization I'm referring to is widening a 32-bit loop induction variable (IV) to 64-bit so it can stay in a register: https://gist.github.com/rygorous/e0f055bfb74e3d5f0af20690759...

size_t obviates the need for this optimization.
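As a rough illustration (not the code from the linked gist), here are the two loop shapes being compared, assuming a typical 64-bit target where int is 32 bits and pointers are 64 bits. The function names are hypothetical.

```c
#include <stddef.h>

/* 32-bit signed index: to address a[i] the compiler needs a 64-bit value.
   Because signed overflow is undefined, it may assume i never wraps and
   keep one widened 64-bit counter instead of re-extending i each time. */
long sum_with_int_index(const int *a, int n)
{
    long sum = 0;
    for (int i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* size_t index: the counter is already pointer-sized (on the assumed
   target), so there is nothing to widen. */
long sum_with_size_t_index(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}
```

Both functions compute the same result; the difference is only in what the compiler has to prove before using a single 64-bit register for the index.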

That's a strange example since it doesn't prove their point.

"int count; … for (int i = 0; i < count; ++i) {}" can't overflow so the optimization is always valid.

"int count; … for (int i = 0; i < count; i += 2) {}" is more of a problem.
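To make the difference between the two loops concrete, here is a small sketch that checks whether the increment in a stepped loop would overflow int. The helper name is hypothetical, and it assumes a positive step.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if `for (int i = 0; i < count; i += step)` would execute an
   `i += step` that overflows int, and 0 otherwise. Requires step > 0. */
int stepped_loop_overflows(int count, int step)
{
    assert(step > 0);
    if (count <= 0)
        return 0;                        /* the body never runs */
    /* Largest value of i for which the body still runs. */
    int last = ((count - 1) / step) * step;
    /* The next increment overflows if last + step > INT_MAX; the test is
       rearranged so that the check itself cannot overflow. */
    return last > INT_MAX - step;
}
```

With step 1 the result is always 0, since i never goes past count. With step 2 and count equal to INT_MAX, i reaches INT_MAX - 1 and the next increment overflows.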

I don't think "have a 64-bit int type" is the right approach for a new language either… we should be aiming for safety. If a variable's valid values are 0-50 then its type should be "integer between 0 and 50", not "integer between 0 and UINT64_MAX". Storage size should be an implementation detail.
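C has no such type built in, but the idea can be approximated. The following is only a sketch with hypothetical names: a wrapper struct whose storage is an implementation detail, plus a checked constructor that rejects values outside 0 to 50.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "integer between 0 and 50" type. The uint8_t storage is an
   implementation detail that callers are not meant to rely on. */
typedef struct { uint8_t value; } range_0_50;

/* Checked constructor: stores v in *out and returns true only when v is
   within [0, 50]; otherwise leaves *out untouched and returns false. */
bool range_0_50_make(long v, range_0_50 *out)
{
    if (v < 0 || v > 50)
        return false;
    out->value = (uint8_t)v;
    return true;
}
```

Unlike a language-level range type, nothing here stops code from writing to the field directly, so the guarantee holds only by convention.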

Ada has this exact functionality: https://www.adaic.org/resources/add_content/standards/05aarm...

Look for the "range" keyword

There was no size_t in K&R1. size_t was introduced in the standards process, as was the definition of index variables in the for loop. You may have a complaint with the standard there.

As for the optimization, it is based on a misunderstanding of C semantics. The only place where the sign extend makes a difference is where pointers are longer than "ints" AND where the iterator can overflow, and in that case, sign extend only makes a difference if the loop is incorrectly coded so that the end condition is never true. The code should just provoke a warning and then omit the sign extend (and it almost certainly doesn't make much of a difference since sign extend is highly optimized and has zero cost in a pipelined processor).

> There was no size_t in K&R1. size_t was introduced in the standards process, as was the definition of index variables in the for loop. You may have a complaint with the standard there.

I certainly do. C and its descendants make you over-specify variables by declaring them all int/size_t/etc individually. It should've had C++11-style "auto" from the start and there should be a statement like "for i=0..n" that declares "i" the same type as "n".
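Something close to "for i=0..n" can be sketched in today's C with a macro. This is only an approximation under stated assumptions: FOR_RANGE is a hypothetical name, and it relies on __typeof__, which is a GCC/Clang extension (C23 standardizes a similar "typeof").

```c
#include <stddef.h>

/* Hypothetical macro: declares i with the same type as n and iterates
   over 0..n-1. Relies on the __typeof__ extension; note that n is
   re-evaluated on every iteration. */
#define FOR_RANGE(i, n) for (__typeof__(n) i = 0; i < (n); i++)

/* Sums 0..n-1; i is a size_t here because n is. */
size_t sum_below(size_t n)
{
    size_t sum = 0;
    FOR_RANGE(i, n)
        sum += i;
    return sum;
}

/* Returns 1 if the index declared by FOR_RANGE has the same size as a
   long long bound, as a quick check that the type follows n. */
int index_has_bound_type(void)
{
    long long n = 3;
    int same = 1;
    FOR_RANGE(i, n)
        same = same && (sizeof(i) == sizeof(n));
    return same;
}
```

The macro still exposes an ordinary mutable "i", so it addresses the over-specified type but not the overflow semantics discussed earlier in the thread.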