by blt 3775 days ago
Here are some tasks that are ugly with [1:n] indexing:

    - the 1D index of element (i,j) in a matrix is i+(j-1)*m instead of i+j*m

    - the i'th 3-element subvector of a vector is v[3*(i-1)+1:3*i] 
      instead of v[3*i:3*(i+1)]

    - if you have a vector of indices that partitions a vector into chunks,
      the i'th chunk is v[ind[i]:ind[i+1]-1] instead of v[ind[i]:ind[i+1]]
Perhaps these are small issues, but they are all real, annoying examples from my most recent MATLAB project.
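
For comparison, here is a minimal Python sketch of the 0-based, half-open forms of the three tasks above. The helper names (flat_index, subvector, chunk) and the sample data are made up for illustration, and the matrix layout is assumed to be column-major with m rows, as in MATLAB.

```python
# Hypothetical helpers illustrating the 0-based, half-open versions
# of the three indexing tasks. Names and data are illustrative only.

def flat_index(i, j, m):
    """Column-major 1D index of element (i, j) in a matrix with m rows."""
    return i + j * m

def subvector(v, i, width=3):
    """The i'th width-element subvector of v (i counts from 0)."""
    return v[width * i : width * (i + 1)]

def chunk(v, ind, i):
    """The i'th chunk of v, where ind holds the chunk boundaries."""
    return v[ind[i] : ind[i + 1]]

v = list(range(10, 22))   # 12 elements: 10, 11, ..., 21
ind = [0, 2, 7, 12]       # boundaries of three chunks

print(flat_index(1, 2, 4))  # 9
print(subvector(v, 1))      # [13, 14, 15]
print(chunk(v, ind, 1))     # [12, 13, 14, 15, 16]
```

Note that the boundary vector ind needs one more entry than there are chunks, so that the last chunk also has an end.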

But maybe, like the static typing issue, my opinion on this topic is distorted because I spent a lot of time programming in C++ and comparatively little time reading math papers.

Or maybe it would be equally easy to make a list of tasks that are ugly with [0:n) indexing.


>I really, really wish they had dropped the 1-based indexing

>, my opinion on this topic is distorted because I spent a lot of time programming in C++

Mathematics-oriented languages and tools[1] such as MATLAB, R, Mathematica, and SAS all use 1-based indexing. Given that the originators of Julia are MATLAB users, it makes sense that they made a deliberate choice to keep 1-based indexing.

In other words, it was more important to grab mindshare from those earlier math tools than to appeal to C/C++/Java/etc. programmers.

One outlier in the landscape of numerical programming is Python+NumPy/SciPy in the sense that it uses 0-based indices. While Julia also wants to be attractive to Python programmers, it still seems like the bigger motivation was programmers of MATLAB and other math software.

[1]https://www.youtube.com/watch?v=02U9AJMEWx0&feature=youtu.be...

This, pretty much. Not to mention that, beyond languages, data is often 1-based indexed. I have never gotten a patient data set with ID=0 as the first entry. In my mind, compatibility with what users are expecting, and trying not to induce indexing errors, trumps most other concerns.

>, beyond languages, data is often 1-based indexed.

That's a good point. Probably the most widespread data example for non-programmers is spreadsheets (MS Excel, Google Sheets). The first row[1] in the spreadsheet is labeled as "1" instead of "0". The idiomatic Visual Basic programming code to loop through the rows would look something like:

  For Each cell In Range("a1:a25")  ' not "a0:a24"
      ' do work
  Next cell
[1]https://www.google.com/search?q=microsoft+excel+spreadsheet+...

Mathematica is actually 0-based, but with the zero index spot reserved. A list {1, 2, 3} = List[1, 2, 3] could be read as (List 1 2 3) in Lisp style; if you have Mathematica, you can check that {1, 2, 3}[[0]] returns List.

But I think that's neither here nor there. Whenever the index has more use than as a label, mathematics starts at zero. Modular arithmetic, polynomials, discrete Fourier transforms (for that matter, any discrete approximation of continuous math) all naturally start at zero, and generate lots of -1s in one-based indexing.
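
A small Python sketch of where those -1s come from, using wraparound indexing and polynomial coefficients. The function names are hypothetical, and the "one-based" variants only simulate 1-based positions on top of Python's 0-based lists.

```python
# Hypothetical helpers; the *_one_based variants simulate 1-based positions.

def shift_zero_based(i, k, n):
    """Move k slots around a ring of n slots numbered 0..n-1."""
    return (i + k) % n

def shift_one_based(i, k, n):
    """Same move on a ring numbered 1..n: subtract 1, wrap, add 1 back."""
    return (i - 1 + k) % n + 1

def poly_eval_zero_based(coeffs, x):
    """coeffs[k] is the coefficient of x**k."""
    return sum(c * x**k for k, c in enumerate(coeffs))

def poly_eval_one_based(coeffs, x):
    """Position p (1..n) holds the coefficient of x**(p - 1)."""
    return sum(c * x**(p - 1) for p, c in enumerate(coeffs, start=1))

print(shift_zero_based(3, 1, 4))           # 0
print(shift_one_based(4, 1, 4))            # 1
print(poly_eval_zero_based([1, 2, 3], 2))  # 17
print(poly_eval_one_based([1, 2, 3], 2))   # 17
```

Both pairs compute the same thing; the 1-based versions just carry the extra offset in every expression.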

In this subthread, "0-based / 1-based" refers to the index used to access the first element, i.e. x[0] in C/C++/Python/etc. or x[1] in R. Yes, in Mathematica, putting "0" between "[[" and "]]" will get you the reserved "head", but that's not how people are talking about "0-based" here.

You will find problematic examples no matter which option you choose (which is why I personally favor being able to set the lower bound freely).

With half-open ranges, for example, you will need different code to address a segment and the last element of a segment. E.g., if you have some structure with start_of(i) and end_of(i) expressions, then with closed ranges you can use a[start_of(i):end_of(i)] to address the segment and a[end_of(i)] to access its last element, while with half-open ranges you have to break the abstraction and use a[end_of(i)-1].

You can also iterate over start_of(i) .. end_of(i) in a for loop naturally if ranges are closed. (See how Python's iteration is defined in terms of half-open ranges and how iterating over closed ranges – which happens often enough when the values aren't indices – is a bit of a pain in Python.)
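
A rough Python sketch of that asymmetry. Everything here (the sample list, the helper names, the bounds) is made up for illustration, and since Python slices are themselves half-open, the closed-range helpers have to add 1 internally.

```python
# Hypothetical helpers contrasting closed and half-open segment bounds.

a = [10, 20, 30, 40, 50, 60]

def segment_closed(a, start, end):
    """Segment with inclusive end; Python's slice needs end + 1."""
    return a[start : end + 1]

def last_closed(a, start, end):
    """Last element of a closed segment: the end bound is reused as-is."""
    return a[end]

def segment_half_open(a, start, end):
    """Segment with exclusive end; maps directly onto a Python slice."""
    return a[start:end]

def last_half_open(a, start, end):
    """Last element of a half-open segment: needs the end - 1 adjustment."""
    return a[end - 1]

def closed_range(start, end):
    """Iterate start..end inclusive; range() itself is half-open."""
    return range(start, end + 1)

print(segment_closed(a, 0, 2), last_closed(a, 0, 2))        # [10, 20, 30] 30
print(segment_half_open(a, 0, 3), last_half_open(a, 0, 3))  # [10, 20, 30] 30
print(list(closed_range(1, 3)))                             # [1, 2, 3]
```

Either way the offset shows up somewhere: in the slice for closed bounds, or in the last-element access for half-open bounds.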