Hacker News
by __vim__ 1851 days ago
No fortran guy?

I think the goal in bringing in a C++ programmer was to provide an outsider's view, but no, no Fortran guy. I don't think Fortran is an array language, though, in that arrays are not the only data structures in Fortran?
Fortran isn't an array language at all, really. Maybe some features of the array-language style are superimposed atop it with parallelizing extensions, dialects, or whatnot, but its style has always been and still is to write explicit loops and mutate state left and right. John Backus, in his famous 1977 Turing Award lecture, explicitly named it a representative of the "fat and weak" languages he talked of and said it was a thin veneer over assembly; he based his fictional FP language on APL instead.

Maybe grandparent meant it has the same _application_ as array languages, in that its only surviving kingdom is scientific computing, where devouring gargantuan arrays of numerics is the only thing that matters, unlike C or C++, which are much more widely used. Maybe that's why it has a long history of being parallelized with various tools and runtimes _despite_ its inherent imperativeness. (I don't remember where I heard this, but somebody wrote an analyzer for Fortran programmes in the 90s and found that over 80-90% of what those programmes do is variations of map, filter and reduce. So it's an extremely imperative language working against its intended use. Maybe I will post a link in an edit when I find the source of the claim.)

Fortran has had built-in array operations since the Fortran 90 standard. If X and Y are scalars or arrays of the same shape, you can write X+Y, X*Y, exp(X), sin(X), etc. You can define your own elemental functions that act on both scalars and arrays of any rank. I still write loops when programming in Fortran 95, but less often than in Fortran 77. So I think modern Fortran is an array language.
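
To make the "elemental" idea concrete outside Fortran, here is a minimal Python sketch. It is not Fortran and says nothing about how a Fortran compiler implements elemental procedures; the `elemental` decorator and the use of nested lists as arrays are assumptions made purely for illustration. A function written for scalars is lifted so that it also accepts nested lists of any rank.

```python
import math

def elemental(fn):
    """Lift a scalar function so it also maps over nested lists of any rank."""
    def lifted(x):
        if isinstance(x, list):
            # Recurse into each sub-array; the result keeps the input's shape.
            return [lifted(item) for item in x]
        return fn(x)
    return lifted

@elemental
def square(x):
    # Written as if x were always a scalar.
    return x * x

# Existing scalar functions can be lifted the same way.
elem_exp = elemental(math.exp)
```

With this, `square(3)`, `square([1, 2, 3])` and `square([[1, 2], [3, 4]])` all go through the same definition.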
Is an array language a language where you only have arrays? That honestly sounds a bit odd.
No, an array language is one in which all the built-in operations are applicable to arrays. Take addition as an example:

       4 + 4
    8
In array languages the same operator can be used for arrays; or equivalently you can say that the example above sums two arrays of length 1. In J, you can do this and expect it to work:

       4 4 4 + 2 2 2
    6 6 6
This is true for all the built-ins, and for many user-defined operations (as long as you don't fiddle with the so-called rank of the verb you're defining).

Numpy is close to that, but there's still a distinction between arrays and scalars, while in array languages that distinction is often blurred:

       4 + 1 2 3
    5 6 7
Edit: in J, you can have atoms, or scalars, but you need to box them:

       (3;4;5)
    ┌─┬─┬─┐
    │3│4│5│
    └─┴─┴─┘
but then you can't do anything with them until you unbox them again:

       (3;4;5) + 3
    |domain error
    |   (3;4;5)    +3

       (+&4) each (3;4;5)
    ┌─┬─┬─┐
    │7│8│9│
    └─┴─┴─┘
(Examples straight from J REPL)
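
For readers without a J REPL, the scalar/array blurring above can be approximated in plain Python. This is only a rough sketch of the behaviour, not of J's implementation: the helper name `dyad` is hypothetical, nested lists stand in for arrays, and the "length error" message is just borrowed wording.

```python
import operator

def dyad(op, x, y):
    """Apply a two-argument scalar operation, extending atoms across lists."""
    x_is_list, y_is_list = isinstance(x, list), isinstance(y, list)
    if not x_is_list and not y_is_list:
        return op(x, y)                       # atom with atom
    if x_is_list and y_is_list:
        if len(x) != len(y):
            raise ValueError("length error")  # lengths must agree
        return [dyad(op, a, b) for a, b in zip(x, y)]
    if x_is_list:
        return [dyad(op, a, y) for a in x]    # extend the atom y across x
    return [dyad(op, x, b) for b in y]        # extend the atom x across y
```

`dyad(operator.add, 4, [1, 2, 3])` mirrors `4 + 1 2 3`, and the same function handles `4 + 4` and `4 4 4 + 2 2 2` without a separate code path.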
It seems almost as if it'd be more useful to not explicitly expose operators as applicable to arrays, but instead implement SIMD optimization for operator expressions in .map(function) and something like .binaryMap(right, function), e.g.

    [1, 2, 3].binaryMap([10, 10, 10], (a, b) => a * b) // Produces [10, 20, 30]
This would be easier to optimize when compiled because the expressions can be simplified and mapped to the right SIMD instructions.
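
To pin down the semantics being proposed, here is the same operation as a small Python function. The name `binary_map` is hypothetical (it just mirrors the `.binaryMap` above), and nothing in this sketch is vectorized; it only shows what result a SIMD-backed version would have to produce.

```python
def binary_map(left, right, fn):
    """Combine two equal-length sequences element by element with fn."""
    if len(left) != len(right):
        raise ValueError("sequences must have the same length")
    return [fn(a, b) for a, b in zip(left, right)]
```

For example, `binary_map([1, 2, 3], [10, 10, 10], lambda a, b: a * b)` pairs the elements up and multiplies each pair.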
Unfortunately, I have no idea about the implementation and what optimizations are done in J or APL for which operations :( I know that there are hardcoded "fast path" expressions (particular combinations of operations) which have much better performance than more general expressions doing the same thing, so it might be that the optimization happens at that level.

OTOH, your example is very verbose when compared to J's version:

       1 2 3 * 10
    10 20 30
Plus, in J it generalizes to higher dimensional arrays:

       i. 3 3
    0 1 2
    3 4 5
    6 7 8
       (1 + i. 3 3) * 10
    10 20 30
    40 50 60
    70 80 90
My example is verbose in order to clearly communicate the principle. I can trivially shorten it to

    [1, 2, 3].map(a => a * 10)
    
if I wanted to literally carry out that task alone.

    [[0, 1, 2],
     [3, 4, 5],
     [6, 7, 8]].map(row => row.map(a => a * 10))
Even less verbose:

       (>: i. 3 3) * 10

That just removes a tiny tiny bit of dynamic dispatch overhead. Which is still needed anyways, as array languages can often dynamically switch between 1-bit, 8-bit, 16-bit, 32-bit integer (and 64-bit float) arrays, depending on the elements, completely transparently to the user.
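
A rough idea of what that transparent switching involves, as a Python sketch. The function name and the exact rules are assumptions for illustration (real interpreters decide this differently and in native code); only the list of storage types is taken from the comment above.

```python
def narrowest_type(values):
    """Return a label for the smallest storage type that can hold all values."""
    if any(isinstance(v, float) for v in values):
        return "float64"
    if all(v in (0, 1) for v in values):
        return "bit"
    for label, bits in (("int8", 8), ("int16", 16), ("int32", 32)):
        low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
        if all(low <= v <= high for v in values):
            return label
    # Assumption: anything wider than int32 is stored as a float.
    return "float64"
```

The interpreter still has to dispatch on this storage type at run time for every primitive, which is the overhead that does not go away.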
Most compilers can inline statically defined closures in these contexts. And tracing JITs do this even when the closure is not defined statically (but is stable).

It's more about allowing the SIMD goodness without the ambiguity and restrictions of "scalar operators work on arrays" implemented naively.

That's the case in APL and J. K uses nested lists to represent arrays, and has non-lists (atoms). But the convention is that an n-times nested list is considered an n-dimensional array so even an atom is an array, with 0 dimensions.
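
The "n-times nested list is an n-dimensional array" convention can be sketched in a few lines of Python, with Python lists standing in for K lists. The function name `rank` is an assumption, and the sketch only looks at the first element at each level, so it assumes regular (non-ragged) nesting.

```python
def rank(x):
    """Number of dimensions under the 'n-times nested list' convention."""
    depth = 0
    while isinstance(x, list):
        depth += 1
        if not x:
            break  # an empty list has nothing further to descend into
        x = x[0]   # assumes every element at this level nests equally deep
    return depth
```

Under this convention `rank(5)` is 0, which is the sense in which an atom is still an array.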

There's a page on the various approaches to arrays in the APL family at https://aplwiki.com/wiki/Array_model .

Mostly yes.

Numbers (and characters) are implemented as arrays with 0 dimensions. Text would be an array of characters with 1 dimension (the number of characters), and generally speaking the dimensions of an array are given as a one-dimensional list of non-negative integers. Many array languages also include an array type which is approximately the same as a C pointer to an array, with a bit of jargon thrown in to distinguish a reference to an array from the array itself.
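
One way to picture this is an array as a shape plus a flat list of elements. The Python class below is a sketch under that assumption; the `Array` name, its fields and the row-major layout are illustrative choices, not the internals of any particular APL, J or K implementation.

```python
from dataclasses import dataclass
from math import prod

@dataclass
class Array:
    shape: tuple   # non-negative integers, one per dimension
    data: list     # elements in row-major order

    def __post_init__(self):
        # prod(()) is 1, so a 0-dimensional array holds exactly one element.
        if len(self.data) != prod(self.shape):
            raise ValueError("data length must equal the product of the shape")

    @property
    def rank(self):
        return len(self.shape)

    def at(self, *index):
        """Look up one element by a full index, one integer per dimension."""
        if len(index) != self.rank:
            raise IndexError("need exactly one index per dimension")
        offset = 0
        for i, size in zip(index, self.shape):
            if not 0 <= i < size:
                raise IndexError("index out of range")
            offset = offset * size + i
        return self.data[offset]
```

A number is then `Array((), [42])`, text is `Array((5,), list("hello"))`, and a 2-by-3 matrix is `Array((2, 3), [0, 1, 2, 3, 4, 5])`; all three share one representation and differ only in shape.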

Something like an SQL table in an array language would be implemented as a list of columns (rather than as a list of rows) and a corresponding list of column labels. This has some interesting benefits.
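
A small Python sketch of that column-wise layout, with made-up data and hypothetical helper names (`make_table`, `column`, `select_rows`). One benefit it makes visible: an aggregate over one column only ever touches that column's list.

```python
def make_table(labels, columns):
    """A table as two parallel lists: column labels and column data."""
    if len(labels) != len(columns):
        raise ValueError("one label per column")
    if len({len(col) for col in columns}) > 1:
        raise ValueError("all columns must have the same length")
    return {"labels": list(labels), "columns": [list(c) for c in columns]}

def column(table, label):
    """Fetch a whole column by label without touching the others."""
    return table["columns"][table["labels"].index(label)]

def select_rows(table, keep):
    """Keep the rows where the boolean mask `keep` is true, in every column."""
    return make_table(
        table["labels"],
        [[v for v, k in zip(col, keep) if k] for col in table["columns"]],
    )

# Illustrative data only.
people = make_table(["name", "age"], [["ann", "bob", "cy"], [31, 25, 47]])
over_30 = select_rows(people, [age > 30 for age in column(people, "age")])
```

Here the filter is computed as a boolean mask from the `age` column alone and then applied to every column, which is roughly how a WHERE clause looks when the table is stored by column.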

That said, functions in array languages are typically not arrays (though they presumably would have a textual representation). So... not everything is an array.

From the J docs: https://www.jsoftware.com/help/dictionary/dx005.htm

  nub=: (i.@# = i.~) # ]
     5!:2 <'nub'
  +-------------------+-+-+
  |+--------+-+------+|#|]|
  ||+--+-+-+|=|+--+-+|| | |
  |||i.|@|#|| ||i.|~||| | |
  ||+--+-+-+| |+--+-+|| | |
  |+--------+-+------+| | |
  +-------------------+-+-+
... so J functions have an array representation, at least.
Yes, each J function has several array representations.
You could consider Pandas / numpy an array language, even though it's really a library for Python. The exposed API is IMHO what's important.
I agree, but you have to take extensibility into account. You either have an extensible language and extend it[0], you make it an array language to the core, or you just have a few slightly convenient functions (not a language). (I don't have enough experience to judge which category Pandas / numpy are in)

[0]: https://github.com/phantomics/april