Hacker News new | ask | show | jobs
by 0-_-0 2798 days ago
Well for one, most situations I can think of when an index is calculated a 0-based index is more useful.

Addressing an element T[h][w][c] in a linearized 3D tensor with 0-based indexing:

    T[h*W*C+w*C+c]
with 1-based indexing:

    T[(h-1)*W*C+(w-1)*C+c]
Or let's say you want to take a string "abc" and repeat it until the length is 10, getting "abcabcabca". With 0-based indexing:

    a,b = "abc", [" "]*10
    for i in range(0,10):
      b[i]=a[i%3]
In a 1-based language that's:

    for i in range(1,11):
        b[i]=a[(i-1)%3+1]
Here's what Dijkstra has to say about it:

http://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EW...

5 comments

Julia was developed by scientists that used math software like MATLAB. Most mathematics software uses 1-based indexing. (Previous comment about this.[1])

>with 1-based indexing: T[(h-1) * W * C+(w-1) * C+c]

Your example showing how cumbersome and ugly it is to subtract 1 -- isn't how the intended audience uses mathematics software. Instead, they would use idiomatic multidimensional access with commas or multi-brackets syntax that understands 1-based indexes. They do not manually multiply and add the offsets into a raw 1-dimensional array. (E.g. idiomatic T[x,y] or T[x][y] instead of contrived T[(x-1) * W+(y-1)])

It's also the same concept for MS Excel. In VBA macro programming code, one doesn't need to litter the code with "minus 1" everywhere. The 1st row and 1st column (cell A1) is addressed as "1,1" in the Visual Basic function "Worksheets!Sheet1.Cells(1, 1)"

Mathematics software has a tradition of 1-based indexing probably because many (most?) equations in written mathematics are 1-based. (Examples [2] [3].) It doesn't seem so unreasonable that scientists prefer that their math programming code has 1-based indexes to match the 1-based indexes in traditional math notation.

[1] https://news.ycombinator.com/item?id=11072438

[2] average : summation symbol ∑ has "i=1..n" not "i=0..n-1": https://en.wikipedia.org/wiki/Average#Arithmetic_mean

[3] linear algebra matrices : i and j subscripts begin at 1,1 and not 0,0: https://en.wikipedia.org/wiki/Matrix_(mathematics)

My argument was that in situations where the difference between 0 and 1 based indexing makes a difference it's ususally 0-based indexing that leads to simpler code.
I guess I find it ironic then, that one of the most common programming errors, is of the "off-by-one" variety[1]. It is in part due to differences in convention and notation in various fields[2].

This isn't common in 1-based indexing languages. Very common in 0-based indexing languages[3].

If the 0-based indexing was objectively superior, wouldn't there be fewer instances of such bugs?

[1] https://en.wikipedia.org/wiki/Off-by-one_error

[2] https://en.wikipedia.org/wiki/Zero-based_numbering

[3] https://softwareengineering.stackexchange.com/questions/4289...

I was hoping to read some code analysis that supported your claim of off-by-one being more common in 0-based languages, but all I see in your [3] is a few different speculative reasons - am I missing something?

Off-by-one can show up in a bunch of ways that are agnostic to the underlying indexing, so it isn't obvious to me that there would be a substantial difference at all. Also, given what's said below about the ease of writing code which fails to consider custom indices, I'm not sure I believe just switching to 0-based or anything else in Julia is as painless as you suggested earlier - it could introduce its own off-by-one errors!

Agreed. Off-by-one often shows up when calculating indexes e.g. differences in indexes or offsets. These calculations are independent of the base of the index (0 or 1 or even N).
This implies that off by one is more common in zero based counting. Evidence? Your link didn't seem tob support that changing convention of indexing mattered. Rather, no longer indexing at all is what helped. (E.g., iterators.)

This fits my understanding. I've seen people do off by one just in the wild when counting how many posts/nails/etc are needed.

> simpler code

Have you used julia? You're almost never calculating your offset to a pointer. The VM does that for you. (As it does for python, ruby, erlang). Or actually in the case of julia it's more accurate to say the (extremely lazy ahead of time) compiler does it for you.

In Julia, you don't have to calculate linear indices yourself since you can create a reshaped view of a blob of memory in arbitrary number of dimensions. So this case doesn't come up. Also there are plenty of cases where 0-indexing is more awkward than 1-indexing. Actually reading your Dijkstra link will make that pretty clear. Reading it will also make it clear that the 0-indexing choice is pretty arbitrary and that you can make a very similar case for 1-indexing too. This 0-indexing meme really needs to die, especially the argument from authority by linking to Dijkstra.
Contrary to your summary, the quote makes a good case for 0-based indexing, both theoretical and practical. How can an argument based on reason become an argument from authority just because someone well known said it?
That is in fact how an argument becomes an argument from authority. Also, it has become an HN meme to post the Dijkstra comment every single time there's a post mentioning Julia. It's not like it's news to anyone.
Also Dijkstra's arguments are pretty weak and unconvincing. If anyone else had made the same arguments, no one would ever refer to it.
> Addressing an element T[h][w][c] in a linearized 3D tensor with 0-based indexing:

But why would you ever use linearized addressing when you have proper multidimensional access? Aside from doing the low-level implementation of multidimensional access, you shouldn't ever need to do that.

> Or let's say you want to take a string "abc" and repeat it until the length is 10, getting "abcabcabca". With 0-based indexing

Your example assumes not only the stated indexing model but also a range function that returns the integer members of a half-open range, which is a little crazy of a way to do loops, but the least unintuitive way with 0-based indexing, so programmers are used to it from C (where you do it without an actual function call) and newer zero-based languages.

With 1-based languages, the sane way is the more intuitive closed range (for i in 1:5) which doesbt require an extra mental operation to convert to the set of values that will actually be used. There are good uses for half-open ranges, but the kind of explicit loop control they get used for in 0-based languages is an artifact of 0-based languages, so arguing that it is (even more) awkward in a 1-based language isn't an argument against 1-based languages.

One-based indexing obviously means that ranges are inclusive on the right, and you iterate by `for i = 1:length(x)`.

That being said, I also abhor 1-based indexing in julia.

Well, you get used to it, and it is imho a small price to pay for julia's other cool things.

In the end, julia needs to access arrays through pointers, and indirect adressing / LEA use zero-based indexing. Somewhat amazingly, llvm manages to remove/hoist most of the extraneous subtractions. But still, it would be much more transparent if julia simply reflected the hardware standard of zero-based addressing (when in doubt, stay as close to the silicon as reasonably possible).

Offset arrays are a bad choice when avoidable, because (1) they incur an additional pointer load, (2) most julia code uses 1-based indices (conforming to common language idioms is always good), and (3) you will trigger all sorts of bugs in all sorts of packages that falsely assume that all AbstractArray are one-based.

In fact, almost every solid julia package uses arbitrary indexing for input arrays. `for i = 1:length(x)` is discouraged in production code, instead the ideom is `for i in eachindex(x)`.
Some algorithms work better with 0-based indexing. But maybe, just maybe, there are other aspects to look at when evaluating a programming language than "it doesn't by default follow my favourite indexing type". Especially given that 0-based indexing is very easy to use for any array you may wish to use it for.
Any mention of Julia immediately degenerates into bikeshedding about indexing.