Hacker News
by cduzz 617 days ago
I'll sometimes gauge code complexity by comparing the number of lines of code against the output of

  tar -cf - . | gzip | base64 | wc -l
i.e. "how much does it compress?"
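For anyone who wants to try the same measurement without a shell, here is a rough Python sketch of that pipeline. The function names are made up for illustration, and the 76-column wrap is an assumption that mirrors the default of GNU base64 (other implementations may not wrap at all, which changes what `wc -l` reports).

```python
import base64
import gzip
import io
import tarfile
from pathlib import Path


def compressed_line_count(root: str, width: int = 76) -> int:
    """Approximate `tar -cf - . | gzip | base64 | wc -l` for the tree at root."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        tar.add(root, arcname=".")
    encoded = base64.b64encode(gzip.compress(buf.getvalue()))
    # Assumption: output wrapped at `width` columns, as GNU base64 does by default.
    return -(-len(encoded) // width)  # ceiling division


def source_line_count(root: str) -> int:
    """Count newline bytes in every regular file under root, like `wc -l`."""
    return sum(
        path.read_bytes().count(b"\n")
        for path in Path(root).rglob("*")
        if path.is_file()
    )


def lines_per_compressed_line(root: str) -> float:
    """Higher values mean the tree compresses well, i.e. it is more repetitive."""
    return source_line_count(root) / max(compressed_line_count(root), 1)
```

Calling `lines_per_compressed_line(".")` in a project directory gives one number to compare across code bases; it is only a crude proxy, since the tar headers and any binary files are counted too.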

Looking at APL -- I'm reminded of what happens if I accidentally send the gzipped output to my tty...

I'm impressed that anyone can follow along with code like this (can you find the bug?):

p←{(↑⍵)∘{(⍺∨.=⍵)/⍳n×n∘}¨,⍵},(n*÷2){⍵,⍺⊥⌊⍵÷⍺}'⍳n n←⍴⍵

It really feels like compressed binary data where everyone's got a copy of the dictionary already...

Legitimately curious how APL programmers think about maintainability and readability. Is code just thoroughly commented or otherwise documented?
The most uncompromisingly APL-ish code I've written is the BQN compiler[0]. Hard to write, hard to extend, hard to refactor. I generally recommend against writing this way in [1]. But... it's noticeably easy to debug. There's no control flow, I mean, with very few exceptions every line is just run once, in order. So when the output is wrong I skim the comments and/or work backwards through the code to find which variable was computed wrong, print stuff (possibly comparing to similar input without the bug) to see how it differs from expectations, and at that point can easily see how it got that way.

The compiler's whole state is a bunch of integer vectors, and •Show [a,b,c] prints some equal-length vectors as rows of a table, so I usually use that. The relevant code is usually a few consecutive lines, and the code is composed of very basic operations like boolean logic, reordering arrays with selection, prefix sum, and so on, so they're not hard to read if you're used to them. There are a few tricks, almost all of which are repeated patterns (e.g. PN, "partitioned-none", is common enough to be defined as a function). And fortunately, the line prefaced with "Permutation to reverse each expression: more complicated than it looks" has never needed to be debugged.
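To give a flavor of that style for readers who don't know BQN, here is a small Python sketch. It is not taken from the BQN compiler; `nesting_depth` and `show_rows` are hypothetical names. It computes the parenthesis depth of every character using only boolean masks and a prefix sum, with no parser state and no branching per token, then prints the vectors as table rows for inspection.

```python
from itertools import accumulate


def nesting_depth(source: str) -> list[int]:
    """Depth of each character, computed from 0/1 masks and a running sum."""
    opens = [int(ch == "(") for ch in source]   # 1 where a paren opens
    closes = [int(ch == ")") for ch in source]  # 1 where a paren closes
    # Prefix sum of (+1 for open, -1 for close) gives the depth after each char.
    running = list(accumulate(o - c for o, c in zip(opens, closes)))
    # Add the close mask back so ")" gets the same depth as its matching "(".
    return [r + c for r, c in zip(running, closes)]


def show_rows(*vectors: list[int]) -> str:
    """Format equal-length integer vectors as rows of a table."""
    return "\n".join(" ".join(f"{v:2d}" for v in row) for row in vectors)
```

Debugging then looks like `print(show_rows(opens, closes, depth))`: every intermediate value is a whole vector you can line up against the input.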

Basically, when you commit to writing in an array style (you don't have to! It might be impossible!) you're taking an extreme stance in favor of visible and manipulable data. It's more work up front to design the layout of this data and figure out how to process it in the way you want, but easier to see what's happening as a result. People (who don't know APL, mostly) say "write only" but I haven't experienced it.

[0] https://github.com/mlochbaum/BQN/blob/master/src/c.bqn

[1] https://mlochbaum.github.io/BQN/implementation/codfns.html#i...

God bless, my hat goes off to you sir. I have trouble wrapping my head around the concept of first class functions in ndarrays, let alone implementing it in hardcore APL. That has to be a feat on par with Hsu's Co-Dfns.

I don't suppose you can point to any resources to help wrap my head around BQN, can you?

Well this is pretty much the goal of the BQN website so my best attempts are there. I might point to the quick start page https://mlochbaum.github.io/BQN/doc/quick.html as a way to feel more comfortable with the syntax right away. And the community page https://mlochbaum.github.io/BQN/community/index.html collects links by others; Sylvia's blog in particular focuses on the sorts of flat array techniques that are useful for a compiler.
While I've seen BQN mentioned previously on pages that discuss APL, K, and J, I only now took a look at it.

I've got to say, it's a really impressive language. Very well thought through, it brings some nice ideas. And as someone still newer to the space, it seems to do a great job of eliminating some of the unnecessary complexity of other languages. The straightforward approach on syntax / parsing is really fresh air.

Just looked at the github -- wait, you wrote BQN? My God. Is there any prior art on this -- arraylangs with first class functions? I don't think very many people realize how incredible the semantic power of BQN is. The idea of an arraylang with first class functions... it truly staggers the imagination.

I feel like if I were able to wrap my head around it I would never want to code in anything else. Thanks again and excited to take another look at it!

K, for a start. Whitney's earlier dialect A+ too. See https://aplwiki.com/wiki/First-class_function .
Don't most array languages have first class functions?
What is this witchcraft? I fear that I have seen something that I cannot unsee...
Once you've learned the syntax of the language, long expressions like that are about as readable as however-many-dozen lines of JS/Python with 1-to-3-character variable names; i.e. some parts may be obvious if they're a common pattern or simple enough, but the big picture may take a while to dig out.

Probably the biggest readability concern of overly-golfed expressions really is just being dynamically typed, a problem shared with all dynamically-typed languages. But array languages have the problem worse, as nearly all operations are polymorphic over array vs number inputs, whereas in e.g. JS you can use 'a+b' as a hint that 'a' and 'b' are numbers, & similar.

If you want readable/maintainable code, adding comments and splitting things into many smaller lines is just as acceptable as in other languages.

I am kind of curious whether you have to keep track of the rank/shape/dimensions in your head or whether there is some implicit/explicit convention for conveying that to the reader. Does tracking rank/shape become second nature after a while?

I'm also wondering about things like (APL-style) inner products -- they are undeniably powerful, but it's hard for me to conceptualize use cases above rank 3.

That depends on the specific code. Some code is written to be agnostic to the rank, while others make certain assumptions.

In my code I'd sometimes write assertions in the beginning of a function to not only ensure it's called with the right shape but also as documentation.
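A minimal Python sketch of that habit, using plain nested lists rather than any array library. `shape` and `row_sums` are hypothetical helpers, and `shape` assumes the input is rectangular because it only inspects the first element at each level.

```python
def shape(a) -> tuple[int, ...]:
    """Shape of a rectangular nested list; a scalar has shape ()."""
    dims = []
    while isinstance(a, list):
        dims.append(len(a))
        a = a[0] if a else None  # descend along the first element only
    return tuple(dims)


def row_sums(matrix: list[list[float]]) -> list[float]:
    # The assertions double as documentation: rank 2, at least one column.
    assert len(shape(matrix)) == 2, f"expected rank 2, got shape {shape(matrix)}"
    assert shape(matrix)[1] > 0, "expected at least one column"
    return [sum(row) for row in matrix]
```

Someone reading `row_sums` learns the expected rank from the first two lines, and a caller passing a flat vector gets an immediate error instead of a confusing failure further down.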

Also, in practice really high rank arrays aren't used much. Even 4 is pretty rare.

If there's information on input format, it is simple enough to trace through the following shapes, but it does force reading the code rather linearly. Operations which implicitly restrict the allowed shapes are, unfortunately and by design, rather few.

I basically never use the generalized inner product; it's rather unique to the original APL - J has a variant that doesn't have the built-in reduction, and k and BQN and many if not most other array languages don't have any builtin for it at all. And in general I don't typically use rank higher than like one plus the natural dimensionality of the operation/data in question.
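For readers who haven't met the generalized inner product: on two vectors, APL's f.g applies g pairwise and then reduces with f. A rough Python sketch of just that vector case follows; `inner_product` is a hypothetical name, and the sketch ignores the higher-rank behavior as well as APL's right-to-left reduction order, which only matters for non-associative f.

```python
from functools import reduce
from operator import add, eq, mul, or_


def inner_product(f, g, left, right):
    """Vector case of APL-style f.g: reduce with f over the pairwise g results."""
    assert len(left) == len(right), "vectors must have equal length"
    # reduce() folds left to right; fine for associative f such as + or "or".
    return reduce(f, map(g, left, right))


dot = inner_product(add, mul, [1, 2, 3], [4, 5, 6])       # like +.× : 4 + 10 + 18
any_match = inner_product(or_, eq, [1, 2, 3], [3, 2, 1])  # like ∨.= : any equal position
```

The `∨.=` form is the one that appears in the expression at the top of the thread: it asks whether two vectors agree in at least one position.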

I programmed in APL a long time ago... even got 'not bad' at it.

The best analogy I can give of my thought process is that first I unfolded the problem into one or more many-dimension object(s) ... then took a different "stance" of looking at the object, then refolded them into the final solution.

So yes... I had it all in my head at some point.

You don't really have to worry about keeping track of tons of functions, variables, structs, classes, etc., and trying to keep all the names straight in your head - all you need is to know the symbols, so it's in some ways easier than reading a complex function in more verbose languages where you might need to look up stuff from several libraries just to understand what's going on. Also, that one line is ~100 characters, each of which probably covers ~0.5-1 lines in other languages, so you should expect to set aside a similar amount of time to reading and understanding it.
I suspect that if you're fluent in the language, understanding an expression written in it comes just as easily and quickly as reading a sentence in a book does to me.
Information density is studied in linguistics. It could likely apply to programming languages similarly.
That’s exactly what they say. Though most kdb I’ve seen in business looks more like Python.
my impression is that the language is used more for scripts than for "code" in a true sense. A bit of "how much can you juggle in your mind" going on
i've only seen this style of language commented after a contest is over on stack programming challenges. I have no idea how one would learn all this stuff from code in the wild (like i learned most of python, for example). then again, i don't go searching github for k, apl, or perl for that matter.

I'm sure each of those languages makes some guarantee about the sorts of errors that can be introduced - as opposed to C (let me pick on it) where the errors you know you can introduce, and the errors that are introduced aren't a large union. However i have a hard enough time typing english consistently, so the various "symbol-y" languages just glaze my eyes, unfortunately.

It almost "feels" like these languages are an overreaction to the chestnut "they must get paid by LoC".

Late to the game here but...

> can you find the bug?

Several stand out immediately:

- Two syntax errors: unclosed single quote in '⍳n n←⍴⍵ and no right operand in the second use of Jot (∘). It's not clear how those could have snuck in naturally by accident, but I'll just assume cosmic rays and that they should be simply elided.

- n n←⍴⍵ is setting n twice, which is a bit surprising, though it signals that you probably expect ⍵ to have rank 2. In such cases _ n←⍴⍵ or n←⊃⌽⍴⍵ may be more natural, depending on intent.

- However, Decode (⊥) will error if ⍴⍵ returns anything other than a single integer (or an empty vector), so n n←⍴⍵ is equivalent to just n←⍴⍵ and doubly confusing.

- Which means that (n*÷2){⍵,⍺⊥⌊⍵÷⍺}⍳n n←⍴⍵ can only return a vector, i.e. 1..n with a number tacked on the end: the value of (1-x^n)/(1-x) evaluated at sqrt(n), which is a bit of a strange data structure IMHO. Something to do with geometric series of n^2?

- The second use of Ravel (,) in ,⍵ is redundant, and given the constraints we know above, so is the first use: ,(n*÷2)...

- It also means that (↑⍵) is the same as just ⍵

- But then (⍺∨.=⍵) is always just 1

- Meaning that the whole code is essentially equivalent to p←(n+1)⍴⊂⍳n×n←⍴⍵. I.e. it just outputs n+1 vectors of the integers 1 to n^2.

- Which, without context, is hard to guess intent, but that data structure feels a bit strange. Instead of a vector of uniform-length vectors, a matrix would be more efficient: (n+1)(n*2)⍴⍳n×n←⍴⍵. But that's just a matrix with rows that are all the same, so maybe we could just use the single vector (⍳2*⍨⍴⍵) directly?
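Since several of the points above hinge on Decode (⊥), here is a minimal Python sketch of what it does with a scalar left argument: it treats the right argument as digits, most significant first, and evaluates them in the given base. `decode` is a hypothetical name and the sketch covers only this scalar-base, vector-digits case.

```python
def decode(base: float, digits: list[float]) -> float:
    """Scalar-base case of APL-style ⍺⊥⍵: evaluate digits as a polynomial in base."""
    total = 0
    for d in digits:
        total = total * base + d  # Horner's rule, most significant digit first
    return total


hundreds = decode(10, [1, 2, 3])   # 1*100 + 2*10 + 3
all_ones = decode(2, [1, 1, 1, 1])  # 8 + 4 + 2 + 1
```

When every digit is 1, the result is the geometric sum 1 + x + ... + x^(n-1), which equals (1-x^n)/(1-x) for x other than 1; for other digits it is just the base-x value of the digit vector.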

Really, despite looking strange, once you learn the symbols and basic operations, APL is surprisingly straightforward. If you're on HN, then you're already smart enough to learn the basics easily enough.

Admittedly, though, becoming proficient in APL does take some time and learning pains. Once there, though, it does feel like a superpower.

I'm not sure why it would be any more impressive or surprising than the billions of people who read and write in non-English alphabets.
That's a really good point...

But -- (and forgive me if I'm totally wrong) -- this isn't just "non-English" but "non-phonetic" which is a smaller set of written languages, and the underlying language is ... math.... so understanding the underlying grammar itself relies on having decades of math education to really make it jibe.

If this code is just a final result of "learn math for 2-3 decades, and spend years learning this specific programming language" -- my statement stands. Interacting with this kinda binary blob as a programming language is impressive. I think I read somewhere that Seymour Cray's wife knew he was working too hard when he started balancing the checkbook in hex...

The underlying language isn't really very mathematical, at most there's a bit of linear algebra in the primitives but that's it. You certainly don't need any sort of formal maths education to learn APL. There are about 50 or so new symbols, which is not a big ask, with any sort of focus the majority of the syntax etc can be learned very quickly. The "bugs" in your original code stand out very clearly because things like "∘}" don't make sense, ∘ being "dyadic" (infix).
and it bears mention that a decent chunk of those symbols are things nearly everyone is familiar with from other languages (+, -, =, etc), symbols you've probably seen in math class or on your graphing calculators (÷, ×, ≠, ⌈, ←, etc), and symbols with very strong mnemonic associations once you've seen them explained (≢, ⍋, ⍳, ⌽, etc).