I had thought of APL as something from computing pre-history, with its bizarro custom keyboard, but I learned that APL and other array languages are apparently alive and well. Will subscribe to the podcast.
Two quotes the hosts brought up stuck with me:
(at 15:05) "A language that doesn't change the way you think is not a language worth learning". From Alan Perlis [1], and his Epigrams in Programming (#19) [2]
(at 16:49) "it is a privilege to learn a language/ a journey into the immediate". From poet Marilyn Hacker [3]; totally captivating idea, even if not about programming languages [4]
k (and the closely related q) is the main language used in industry, particularly at investment banks and hedge funds. It can be a bit of a shock to realise there are people in London earning in excess of £1000/day (pretty good for London) working in a language where well-written code looks like this[1]:
us:{$[#i:&{(y~*K)&"*"~\*x}':x;@[x;i;:[;,"_"]];x]}
It's like discovering a whole different world of software development. Also I don't use that example to disparage k, I have come to appreciate the array language way-of-working. It just looks very alien.
It's like someone threw up the noise that modems make during initial connection onto an electric typewriter from the 1960s, and then explained their intention using quotes from a Lovecraft novel.
I am going to tell you something fantastic, but first, I want to explain some things about this:
us:{$[#i:&{(y~*K)&"*"~*x}':x;@[x;i;:[;,"_"]];x]}
The first is that there's a typo in what bidirectional wrote. The above is correct. The second, is what it is. Once I have explained that, I can tell you the fantastic thing.
k syntax is very simple. There's just a few forms you need to be aware of:
f x
which applies f to x.
a f b
which applies f to the two arguments a and b, and:
f[a;b;c]
which allows you to do three arguments. You can write the first one as f[x] and the second as f[a;b] if you like even more consistency. f can be an "operator" -- that is a symbol. The symbols ' / and \ are special and called adverbs. These adverbs have a special form if followed by a colon, so ': is different than ' and has nothing to do with : or '. I think Arthur just ran out of keys on the keyboard. Once you have those, parentheses () and braces {} have some special syntax, just like double-quotes " do.
With the syntax explained, let us try to understand what we are looking at.
us: is how we start assignment. You can say "us gets" if you like (the colon can be pronounced). {} braces surround a lambda, this one takes a single argument "x" (the first argument). $[a;b;c] is cond like in lisp; if a then b else c. # means count. i: is another assignment.
& means where -- the argument to which is going to be a bitmap like 000100b or 01101b or something like that, and where returns the indices of the set bits; the former example being the list 3, the latter example being the three-element list 1 2 4.
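For readers who do not have a k interpreter handy, the two "where" examples above can be mimicked in Python. This is only a rough sketch: the name `where` is mine, and it handles just the boolean-bitmap case described here.

```python
def where(bits):
    """Return the indices at which `bits` is truthy (boolean case only)."""
    return [i for i, b in enumerate(bits) if b]

print(where([0, 0, 0, 1, 0, 0]))  # [3]
print(where([0, 1, 1, 0, 1]))     # [1, 2, 4]
```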
Another lambda comes next: We can see it takes two arguments because there's an x and a y in there (y is the second argument). We can get a clue as to what it expects because the following adverb ': means each-prior. This tells us "x" is going to be a list of things, and this lambda is going to consume them pairwise. If given the list {(x;y)}':"iliketacos" we get the result:
{(x;y)}':"iliketacos"
i
li
il
ki
ek
te
at
ca
oc
so
y is the "previous" value, and "x" is the current value. The "where" before it tells us we want to know the indices where the condition inside is true. Let's try and understand that condition.
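The pairing that each-prior performs can be sketched in Python as follows. The helper name `each_prior` and the `seed` parameter are hypothetical; in particular, what k supplies as the "previous" value for the first item is not spelled out here, so the seed is an assumption.

```python
def each_prior(f, xs, seed=None):
    """Call f(current, previous) for each item of xs, left to right.

    `seed` stands in for the "previous" value of the very first item;
    using None (or a space) there is an assumption of this sketch.
    """
    out = []
    prev = seed
    for cur in xs:
        out.append(f(cur, prev))
        prev = cur
    return out

pairs = each_prior(lambda x, y: (x, y), "tacos", seed=" ")
print(pairs)
# [('t', ' '), ('a', 't'), ('c', 'a'), ('o', 'c'), ('s', 'o')]
```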
y~*K is in parentheses. Parentheses group, so we execute them first (just like in other languages). We're looking for a situation where the previous value is the first (that's what asterisk means here) of K. What is K?
So we're looking for a value (x) whose previous (y) is the first of K which is "select". The "&" that follows here is "and" - Arthur likes to overload operators since there aren't many symbols on the keyboard and this is something you get used to.
So you can read {(y~*K)&"*"~*x}':x as simply trying to find the sequences "select star" -- given a list ("select"; "*"; "from"; "potato") you get 0100b and from ("select"; "*"; "from"; "("; "select"; "*"; "from"; "potato"; ")") you get 010001000b. I think the attempt is to disambiguate the asterisks in the sql:
select * from tacos where cat=4*42
but sql is a strange and irregular language, so this kind of thing is necessary. Back to our query:
us:{$[#i:&{(y~*K)&"*"~*x}':x;@[x;i;:[;,"_"]];x]}
i is going to be the locations of the asterisks following select. If the count of that is nonzero, we're going to do the @-part, and if not, we're just going to return x.
@[x;i;f]
is called amend. It returns x, but at indices i, we apply them to f, so it's x[i]:f[x[i]] which is pretty cool. f in this case is a projection, of "gets" (the function colon) with the second-argument bound to an underscore. That is:
:[;,"_"]
is just a function. That's how @[x;i;:[;,"_"]] replaces all of the asterisks that follow select with a "_"
Almost. It's actually a list of length one, rather than the scalar "_". I haven't read everything in sql.k but this is probably important elsewhere.
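Putting the pieces together, here is a rough Python sketch of what amend and the whole of us appear to do. The names `amend` and `us` mirror the k ones, but the contents of K beyond its first element are an assumption, and Python has no distinction between the scalar "_" and the length-one list, so that detail is lost.

```python
# Assumed keyword list; all this sketch relies on is that its first item is "select".
K = ["select", "distinct", "partition", "from", "where",
     "group", "having", "order", "limit"]

def where(bits):
    return [i for i, b in enumerate(bits) if b]

def amend(x, i, f):
    """Return a copy of x with f applied to the items at indices i."""
    out = list(x)
    for j in i:
        out[j] = f(out[j])
    return out

def us(tokens):
    # Bitmap: the current token is "*" and the previous one is the first of K.
    bits = [k > 0 and tokens[k - 1] == K[0] and tokens[k] == "*"
            for k in range(len(tokens))]
    i = where(bits)
    # The projection ignores the old value and always yields "_".
    return amend(tokens, i, lambda _old: "_") if i else tokens

print(us(["select", "*", "from", "potato"]))
# ['select', '_', 'from', 'potato']
```

Like the k original, this returns the input untouched when there is nothing to replace and only copies when an amend is needed.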
Ok. Now that I have explained what this is and what it does, I am ready to tell you something fantastic. I read this:
us:{$[#i:&{(y~*K)&"*"~*x}':x;@[x;i;:[;,"_"]];x]}
as:
"us gets a function, that finds the indices of asterisk following the first element of K, and then replaces the things at those indices with underscores"
Literally. From left, to right. Just that fast. And I only program in k part-time. That's not the fantastic thing. The fantastic thing is that by learning to read k, I am almost miraculously able to read other languages faster. This:
copied=False
for i in range(1,len(a)):
    if a[i] == "*" and a[i-1] == "select":
        if not copied:
            copied = True
            a = a[:]
        a[i] = "_";
gives me some grief for being so irregular and gross, and I have to look up range/xrange and len and memorise a much more complex set of rules for syntax, and I have to track the order of things carefully and so on, but I have places in my brain, made by k, for those things, and so I am able to absorb code in other languages faster.
If that does not amaze you, I do not think you have considered the ramifications of what I said. I can suggest maybe reading it again (or maybe actually reading what I wrote instead of skipping to the punchline), but if after two or three tries you are still lost, maybe you can ask a question and I can try to answer it.
let mut input = ["select", "*", "from", "potato"];
for i in 1..input.len() {
    if ["select","*"] == input[i-1..=i] {
        input[i] = "_";
    }
}
If I could be bothered to dig up a Haskell compiler, I'm sure it's possible to do a one-liner list comprehension that is both terse and readable.
If I was doing this in Rust, it's easy to create an iterator extension that does something like "map_lookback" which explains the intention without requiring comments.
I'm going to be blunt: Unreadable array languages are popular with quants because they work in a highly competitive, cut-throat industry where "write only" languages provide job security. I've come across developers purposefully obfuscating code by using only one-character identifiers and zero comments. One of them literally blackmailed their employer, demanding their salary be doubled on the grounds that there was no chance their code could be maintained by anyone else. He should have gone to jail for that, but he had his managers by the balls, got what he wanted, and bought a house with cash soon after.
Yes, to you. Just like Chinese would be infinitely less readable to you, if you don't know Chinese.
Do you think this is a deep observation?
The real challenge is knowing whether it is worth it to learn k so that it becomes readable.
- How many characters is it? This is a useful metric when you realise bug/defect rate is proportional to the physical size (in rows and columns of source code) of a program given equivalent processes (Moore 1992; McConnell 1993) but that process matters more than anything else. Not tooling, not "memory safety", and certainly not "readability" by people unfamiliar with the language.
However "readable" you think your rust code is, did you notice the bug in your rust code? (hint: input is not supposed to be mutable)
- How fast is it? On my 2014 i7 macbook air, the supplied us averages 232 msec for 100k cycles. My version
us:{$[x~(z;y);,"_";y]}[(*K;,"*")]':
is faster: 126msec for 100k cycles.
- How quickly was it written? I can't speak for sa/atw on us since I didn't see them write it. I wrote mine in about 30 seconds including testing. Yes really. How long did your rust program take to write? Did you think simply because you thought you understood the requirements that you didn't need to test it?
- How quickly can someone familiar with the language read it? Again, I can't speak for anyone other than myself, but I'm not a k expert -- I program in it very infrequently, and the new amend-syntax in k9 I had not run across previously. And yet I read it as quickly as I said.
These four values (less code, fast run, fast write, fast read) are the biggest most important things to me. And anyone who shows me all four of these things will get my attention.
Rust? Simply does not impress.
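For anyone who cannot read the shorter us above, its logic seems to be: walk the list once, and wherever the pair (previous, current) is (first of K, "*"), emit "_" instead of the current token. A Python approximation, with `us_short` and `FIRST_OF_K` as made-up names and "select" as the assumed first element of K:

```python
FIRST_OF_K = "select"  # assumed value of *K

def us_short(tokens):
    """Map every token, replacing "*" with "_" when it follows FIRST_OF_K."""
    out = []
    prev = None  # stand-in for "no previous token"
    for cur in tokens:
        out.append("_" if (prev, cur) == (FIRST_OF_K, "*") else cur)
        prev = cur  # always the original token, never the replacement
    return out

print(us_short(["select", "*", "from", "potato"]))
# ['select', '_', 'from', 'potato']
```

This says nothing about the timings quoted above; it is only meant to show the shape of the computation.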
> Unreadable array languages are popular with quants because they work in a highly competitive, cut-throat industry where "write only" languages provide job security.
That's another interesting opinion. This one might even be true amongst some quants (Most of the ones I know that use k don't particularly like k). But I suggest you try not to expect the worst in people. Yes, some people are assholes, but most people aren't. And for what it's worth, I'm not a quant (I work in Advertising).
1> (defun rewrite (fun list)
     (build
       (while* list
         (let ((nlist [fun list]))
           (if (eq list nlist)
             (if list (add (pop list)))
             (set list nlist))))))
rewrite
2> (defmacro rewrite-case (sym list . cases)
     ^(rewrite (lambda (,sym)
                 (match-case ,sym
                   ,*cases))
               ,list))
rewrite-case
3> (rewrite-case x '(foo bar * select * fox select * bravo)
     ((select * . @rest) ^(select _ . ,rest))
     (@else else))
(foo bar * select _ fox select _ bravo)
rewrite-case and rewrite appear in the TXR Lisp internals; they are used in the compiler for scanning instruction sequences for patterns and rewriting them.
E.g. a function early-peephole looks for one particular four instruction pattern (six items, when the labels are included). Rewriting it to a different form helps it disappear later on.
This is much more general and powerful than a hack which just looks at successive pairs for an ad-hoc match. rewrite-case can have multiple clauses, of different lengths, and arbitrary matching complexity.
The original requirements should be addressed. The thing being matched is not just select, but actually any one of a set of symbols that appear in K.
We can stick that data into the pattern matching syntax using the or operator:
3> (rewrite-case x '(foo bar * select * fox where * bravo)
     ((@(or select distinct partition from where
            group having order limit) * . @rest) ^(,(car x) _ . ,rest))
     (@else else))
(foo bar * select _ fox where _ bravo)
Or put it into a variable:
4> (defvarl K '(select distinct partition from where group having order limit))
K
5> (rewrite-case x '(foo bar * select * fox where * bravo)
     ((@(member @sym K) * . @rest) ^(,sym _ . ,rest))
     (@else else))
(foo bar * select _ fox where _ bravo)
"If an object sym which is a member of K is followed by * and some remaining material, replace that by sym, underscore and that remaining material."
Hash table:
6> (set K (hash-list '(select distinct partition from where group having order limit)))
#H(() (select select) (where where) (limit limit) (order order)
   (having having) (distinct distinct) (partition partition) (group group)
   (from from))
7> (rewrite-case x '(foo bar * select * fox where * bravo)
     ((@[K @sym] * . @rest) ^(,sym _ . ,rest))
     (@else else))
(foo bar * select _ fox where _ bravo)
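For readers without TXR, the rewrite loop above can be approximated in Python. This is a sketch of the idea only: `rewrite` and `star_rule` are my names, the keyword set is copied from the list used in this comment, and the while* detail (the function being called once even on an empty list) is left out.

```python
K = {"select", "distinct", "partition", "from", "where",
     "group", "having", "order", "limit"}

def rewrite(fun, items):
    """Repeatedly offer the remaining list to `fun`.

    If `fun` hands back the same object, nothing matched at the head, so
    one item moves to the output; otherwise scanning continues on the
    rewritten list.
    """
    out = []
    rest = list(items)
    while rest:
        new = fun(rest)
        if new is rest:
            out.append(rest.pop(0))
        else:
            rest = new
    return out

def star_rule(xs):
    # "A member of K followed by * and some remaining material."
    if len(xs) >= 2 and xs[0] in K and xs[1] == "*":
        return [xs[0], "_"] + xs[2:]
    return xs

print(rewrite(star_rule, "foo bar * select * fox where * bravo".split()))
# ['foo', 'bar', '*', 'select', '_', 'fox', 'where', '_', 'bravo']
```

Because the rule sees the whole remaining list, a rule can match patterns of any length, which is the generality being argued for here.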
No. I flash brackets, but I tend to turn off other forms of syntax highlighting. I find it extremely distracting when the syntax highlighter "decides" wrong, and I've become convinced that comments at the end of lines like //} to "fix" the highlighter so it can deal with complicated stuff are hurting more than they're helping.
> The claim that you have to look up len seems disingenuous
In what way?
I have to look up "how do I get the indices of a list" to get range(1,len(x)) -- I think I could have also used enumerate() and a bunch of other things, but this seemed the shortest.
> All you're communicating here is that you don't regularly work with Python.
I hope I'm communicating more than that because I put a lot of effort into my comment. I don't regularly work with k either.
Let's do this using the same approach, "find indices and assign over them in a copy of the sequence":
This is the TXR Lisp interactive listener of TXR 259.
Quit with :quit or Ctrl-D on an empty line. Ctrl-X ? for cheatsheet.
Do not operate heavy equipment or motor vehicles while using TXR.
1> (defun subst-select-* (list)
     (let ((indices (where (op starts-with '(select *))
                           (cons nil (conses list)))))
       (if indices
         (let ((list (copy list)))
           (set [list indices] (repeat '(_)))
           list)
         list)))
subst-select-*
2> (subst-select-* '(foo bar * select * fox select * bravo))
(foo bar * select _ fox select _ bravo)
(conses list) gives us a list of the list's conses: e.g. in (1 2 3) the conses are (1 2 3), (2 3) and (3), so the list of them is ((1 2 3) (2 3) (3)).
where applies a function to a sequence, and returns the 0-based indices of where the function yields true.
(op starts-with '(select *)) yields a lambda which tests whether its argument starts with (select *). No brainer.
If we naively applied that to the conses, we would get the wrong indices: the indices of the select symbols, not of the asterisks.
The workaround for that is (cons nil (conses list)): we cons an extra dummy nil element to shift the positions, and process the resulting list.
Once we have the indices list, if it isn't empty, we copy the original input, and assign underscore symbols into the indicated positions. To do that we generate an infinite lazy list of underscores; the assignment takes elements from this list and puts them into the specified index positions.
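The same steps (suffixes, a dummy element to shift positions, indices, copy and assign) can be sketched in Python for anyone who wants to step through them. The function names are mine and plain Python lists stand in for Lisp lists, so the suffixes here are copies rather than shared conses.

```python
def conses(items):
    """All non-empty suffixes of items, longest first."""
    return [items[k:] for k in range(len(items))]

def subst_select_star(items):
    # Prepend a dummy entry so that each match index points at the "*"
    # rather than at the "select" before it.
    shifted = [None] + conses(items)
    indices = [k for k, tail in enumerate(shifted)
               if tail is not None and tail[:2] == ["select", "*"]]
    if not indices:
        return items
    out = list(items)  # copy, leaving the caller's list untouched
    for k in indices:
        out[k] = "_"
    return out

print(subst_select_star("foo bar * select * fox select * bravo".split()))
# ['foo', 'bar', '*', 'select', '_', 'fox', 'select', '_', 'bravo']
```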
If this were me, and I needed a function like us, I would have written this:
us:{$[x~(z;y);,"_";y]}[(*K;,"*")]':
I would be interested in seeing anything that was shorter[1] and faster than that in any language, and I would be very curious to learn from anyone who could also do that faster than me.
But I'm not a fetishist: I didn't learn k because it was cute, and I don't wake up every day looking for ways to rewrite other people's code so that it is slower and bigger. Do you? Or is today special?
As someone who only knows English: no, that is not what Chinese and Japanese look like to me.
That's why I prefer APL and its special symbols. It's still utterly inscrutable if you don't know it, but at least it looks intentional. Those symbols shift your mindset or reframe what you're looking at.
"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."
Dismissing what you don't understand because it is unfamiliar is the essence of a shallow dismissal, no? And you did it twice in this thread. The first was a blessing in disguise because of geocar's excellent reply, but now you're just repeating it, with added name-calling. Please don't do that here.
Various parts, particularly in markets, from pricing quants to high-frequency traders. It is fairly widespread. Barclays, JPM, UBS, Morgan Stanley, HSBC are some of the big names, then you have loads of smaller firms.
I recall a discussion on array programming languages on here a while back where someone claimed that an acquaintance of theirs was earning nearly 7 figures working on q/kdb+.
The latest language which fits quote 1 for me was Haskell. Even though I already had some functional background (Lisp), it took me seemingly forever to actually grok purely functional programming. But once it clicked, it felt like stepping up on a ladder. My perspective on other languages changed as well.
I have a common-lisp background, and I learned Haskell at the insistence of a former colleague. I also know k, and have since learned some APL and j. I would like to try and suggest to you my perspective:
The jump from Python to Haskell - or really anything along that way is like talking about a ladder of computing. You start at one end, and you are climbing upwards. And every step you take, you can look down and see all of the things you knew before, but with greater perspective.
And Haskell? Well, it's definitely pretty far up the ladder. If you get Haskell, you feel like you really understand what's going on. I know pg was talking about lisp when he was thinking blub, but in some blubish respects, Haskell is a better lisp than lisp.
But see, going from Haskell (or really anything) to Iverson is like, listen: Forget the ladder, because a ladder only goes up and down. Iverson is sideways. It is in this way, like adding depth to flatland, that Arrays are an even bigger deal than you can possibly imagine until you go there.
For me it was Prolog. I came to a class, which used Prolog, with a bad attitude of "whatever I can think of, I can program in C". Luckily for me I was schooled.
For me it was Rust. I had done something like 12(?) languages before, including F#, which was also a change of mind, but for the first 2 or 3 months of Rust I felt like an idiot staring intensely at a wall. I started to think my 20 years of programming were a big fat lie.
I couldn't understand why it felt so hard. I already knew Pascal and Obj-C and F#! Kind of similar, no?
Now Rust feels so easy (as easy as Python!) that it is weeeeeeird. (By the way, I think it has been months since I last met an error or situation that truly confused me.)
These days having good video and audio quality doesn't require a lot of money. It just requires a couple of hours educating yourself online about what you need and how to use it.
This is very exciting. I’ve been really inspired by the passion Conor Hoekstra has for APL and J. His other podcast (Algorithms + Data Structures = Programming) is a lot of fun, and his YouTube videos are very educational, but I’ve really wanted something like this where he can interact with other experts outside of the C++ world.
So this thread's podcast of 52 minutes of a complex technical topic with multiple speakers could cost ~$200. A programming-related podcast is already a niche topic with a tiny audience and an Array Languages podcast is an even tinier subset of that so the cost might not be justified.
I suppose podcasts could be uploaded to Youtube and let their speech-to-text algorithm do an auto-transcribe. However, the A.I. algorithm is not good at tech topics with industry jargon/acronyms and the resultant transcription will be inaccurate.
I make transcripts of all my work using Descript. It uses Google's speech-to-text algo (same as the one in youtube presumably) and gives you a transcript you can then edit. It costs $15/month I believe, and you have to spend some time editing the transcript that realistically won't be read by many, but it works pretty well ime (no affiliation besides being a happy customer)
Right. A high-quality podcast is already lots of pre- and post-production work on just the audio. I use Rev which hires captioners on my behalf [0] but it's also expensive. I use it sparingly.
Chrome now provides on-device powered live captions (which hooks into any chrome originating audio) - chrome://settings/accessibility -> toggle "Live Captions"[1] which could help alleviate some of the limitations for audio impaired viewers
>Chrome now provides on-device powered live captions [...] which could help alleviate some of the limitations for audio impaired viewers
That's a great feature! But it also highlights the limited accuracy of the AI machine learning algorithm for technical topics with jargon. E.g., at 27m00s, the caption algorithm incorrectly transcribes it as "APL is joked about as a right only language" -- but we know the speaker actually said, "APL is joked about as a write-only language". And the algorithm incorrectly transcribes "oversonian languages" when it's actually "Iversonian languages".
The algorithm also doesn't differentiate multiple speakers and the generated text is just continuously concatenated even as the voices change. Therefore, an audio-impaired viewer wouldn't know which person said a particular string of words.
This is why podcasters still have to pay humans (sometimes with domain knowledge) to carefully listen to the audio and accurately transcribe it.
Samsung's bastard version of Android had a similar "Automated Subtitles" feature. It's decent for watching videos with the phone on silent, but it's pretty crap when there are lots of proper nouns and unusual jargon, as I imagine this podcast has.
I listened to the first episode and it was really interesting to listen to really experienced programmers share the things they like the most about array programming and the array-oriented languages they are most familiar with.
I think the goal in bringing in a C++ programmer was to provide an outsider view, but no, no Fortran guy. I don't think Fortran is an array language in that arrays are not the only datastructures in Fortran, though?
Fortran isn't an array language at all, really. Maybe some features of the array language style are superimposed atop it with parallelizing extensions, dialects, or whatnot; but its style has always been and still is to write explicit loops and mutate state left and right. John Backus in his famous 1977 Turing Award speech explicitly named it a representative of the "fat weak" languages he talked of and said it was a thin veneer over assembly; he based his fictional FP language on APL instead.
Maybe grandparent meant it has the same _application_ as array languages, in that its only surviving kingdom is scientific computing where devouring gargantuan arrays of numerics is the only thing that matters, unlike C or C++ that are much more widely used. Maybe that's why it has a long history of being parallelized with various tools and runtimes _despite_ its inherent imperativeness. (I don't remember where I heard this, but somebody wrote an analyzer to analyze some Fortran programmes in the 90s and found that over 80%/90% of Fortran programmes are spent doing variations of map, filter and reduce. So it's an extremely imperative language against its intended use. Maybe I will post a link in an edit when I find the source of the claim)
Fortran has had built-in array operations since the Fortran 90 standard. If X and Y are scalars or arrays of the same shape, you can X+Y, X*Y, exp(X), sin(X) etc. You can define your own elemental functions that act on both scalars and arrays of any rank. I still write loops when programming in Fortran 95 but less often than in Fortran 77. So I think modern Fortran is an array language.
No, an array language is one in which all the built-in operations are applicable to arrays. Take addition as an example:
4 + 4
8
In array languages the same operator can be used for arrays; or equivalently you can say that the example above sums two arrays of length 1. In J, you can do this and expect it to work:
4 4 4 + 2 2 2
6 6 6
This is true for all the built-ins, and many user defined operations (as long as you don't fiddle with so called rank of the verb you're defining).
Numpy is close to that, but there's still a distinction between arrays and scalars, while in array languages that distinction is often blurred:
4 + 1 2 3
5 6 7
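To make the scalar/array blurring concrete for non-J readers, here is a small pure-Python sketch of an addition that accepts scalars and lists alike. The `add` function is hypothetical, and J's real agreement rules (rank and so on) are richer than this.

```python
def add(a, b):
    """Element-wise +, extending a scalar against a list where needed."""
    a_is_list, b_is_list = isinstance(a, list), isinstance(b, list)
    if a_is_list and b_is_list:
        if len(a) != len(b):
            raise ValueError("length error")
        return [add(x, y) for x, y in zip(a, b)]
    if a_is_list:
        return [add(x, b) for x in a]
    if b_is_list:
        return [add(a, y) for y in b]
    return a + b

print(add(4, 4))                  # 8
print(add([4, 4, 4], [2, 2, 2]))  # [6, 6, 6]
print(add(4, [1, 2, 3]))          # [5, 6, 7]
```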
Edit: in J, you can have atoms, or scalars, but you need to box them:
(3;4;5)
┌─┬─┬─┐
│3│4│5│
└─┴─┴─┘
but then you can't do anything with them until you unbox them again:
That's the case in APL and J. K uses nested lists to represent arrays, and has non-lists (atoms). But the convention is that an n-times nested list is considered an n-dimensional array so even an atom is an array, with 0 dimensions.
Numbers (and characters) are implemented as arrays with 0 dimensions. Text would be an array of characters with 1 dimension (the number of characters), and generally speaking the dimension of an array is a one dimensional list of non-negative integers. Many array languages also include an array type which is approximately the same as a C pointer to an array, with a bit of jargon thrown in to distinguish a reference to an array from the array itself.
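As a rough illustration of "the dimension of an array is a list of non-negative integers", here is a Python sketch that computes such a list for nested lists. The `shape` helper is hypothetical and assumes the nesting is rectangular; it only inspects the first item at each level.

```python
def shape(a):
    """Dimensions of a rectangular nested list; an atom has shape []."""
    dims = []
    while isinstance(a, list):
        dims.append(len(a))
        if not a:
            break
        a = a[0]
    return dims

print(shape(7))                       # []
print(shape(list("text")))            # [4]
print(shape([[1, 2, 3], [4, 5, 6]]))  # [2, 3]
```

The number of dimensions is then just the length of that list, so an atom has 0 of them.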
Something like an SQL table in an array language would be implemented as a list of columns (rather than as a list of rows) and a corresponding list of column labels. This has some interesting benefits.
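A minimal sketch of that column-wise layout in Python, with made-up labels and data. The benefits are not listed above; the two shown here (whole-column aggregates and index-based filtering) are my own illustration.

```python
labels = ["name", "qty"]                # hypothetical column labels
columns = [["taco", "potato", "cat"],   # one list per column
           [4, 42, 7]]

def column(label):
    return columns[labels.index(label)]

# A whole-column operation touches a single list.
total = sum(column("qty"))

# A filter is a list of row indices that can be reused across every column.
rows = [i for i, q in enumerate(column("qty")) if q > 5]
selected = [[col[i] for i in rows] for col in columns]

print(total)     # 53
print(selected)  # [['potato', 'cat'], [42, 7]]
```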
That said, functions in array language are typically not arrays (though they presumably would have a textual representation). So... not everything is an array.
True, but I also feel like array programming is seen as less prevalent in industry than it really is due to a lack of online community (e.g. Haskell as a language probably has an order of magnitude more content online, let alone the functional paradigm).
In any financial centre there are hundreds of (often very well-paying) jobs using these languages (well mainly k and its ilk).
[1] https://en.wikipedia.org/wiki/Alan_Perlis [2] https://cpsc.yale.edu/epigrams-programming [3] https://poets.org/academy-american-poets/winner/prizes/james... [4] https://www.enotes.com/topics/marilyn-hacker/critical-essays