by jcrites 1145 days ago
Not knowing APL, I unfortunately find the beginning code completely unreadable:

  ⍸(⊃⍵)@0⊣¯1⌽0 1⍷⍵
[Edit: My thought process here is: "This syntax is so difficult that I can't even begin to reason about it even using web search. Can GPT understand it? Yep, it seems to."

"Could this algorithm be more clearly expressed in another language? Yes, it looks like it."]

----

I asked ChatGPT (4) to explain this expression (Prompt: "Explain what this APL code does: ___"). In case others find it helpful, the response is below.

First, let's break down each symbol:

1. ⍵ represents the right argument of a function.

2. ⊃ is the disclose or first function, which picks the first item of its argument.

3. ⍷ is the "where" function, which returns a boolean mask of where its left argument appears in its right argument.

4. ⌽ is the reverse function. With the argument ¯1, it rotates its argument one place to the right.

5. @ is the "at" function, which modifies certain elements of an array. (⊃⍵)@0⊣ replaces zeros in its right argument with the first element of ⍵.

6. ⊣ is the left function, which returns the left argument, ignoring the right one.

7. ⍸ is the "where" function, which returns the indices where its argument is true (or in the case of numeric data, non-zero).

Now, let's combine these to understand the whole expression:

0 1⍷⍵ returns a boolean mask of where the sequence 0 1 appears in ⍵.

¯1⌽ then rotates this mask one place to the right.

(⊃⍵)@0⊣ replaces zeros in the rotated mask with the first element of ⍵.

Finally, ⍸ returns the indices of non-zero elements in the result.

So, this code is finding the indices in the array ⍵ where a 0 is immediately followed by a 1, and replaces these 0s with the first element of ⍵. It then returns the indices of the non-zero elements in the result.

[Edit: I'm not entirely sure but this explanation might contain an off-by-one error. I don't quite follow APL enough to tell. The translated alternative does for sure though.]

----

On that note, I'll remark that I find APL's syntax to have distinct disadvantages. I know a number of programming languages well, having written production code in them for decades. I've come across basically every mainstream programming language and find them generally readable, even though I don't have much domain knowledge, because they share so many concepts that are typically reflected with similar syntax (for loops, assignments, functions, etc.) Even relatively more "out there" languages like Haskell and Perl are somewhat readable without a ton of domain knowledge.

I find it very difficult to get on board with languages that value terseness of this sort over readability (like in K and APL – especially with APL's special symbols). Code is read so much more than it's written, in general, that transparency is valuable and important – as well as the ability to search for code fragments to understand their meaning.

I asked ChatGPT4 to translate this code into Rust, for example, to see what it would come up with. Here was its literal translation, which relies on mutating the input (analogous to the APL code). This is similar to what you'd get transliterating the code into C/C++/Java:

  fn find_zero_one_and_replace(input: &mut Vec<i32>) -> Vec<usize> {
      let first_element = *input.first().unwrap_or(&0);
      let mut indices = vec![];
      let mut prev_zero_index = None;
  
      for (i, &item) in input.iter().enumerate() {
          if item == 0 {
              prev_zero_index = Some(i);
          } else if item == 1 {
              if let Some(index) = prev_zero_index {
                  input[index] = first_element;
                  indices.push(index);
                  prev_zero_index = None;
              }
          }
      }
      
      indices
  }
[Edit: I haven't checked whether this is correct]

After some back-and-forth, asking it to write a function that doesn't mutate the input, and uses iterators, it comes up with some code that I think is probably readable even if you don't know the language well (barring some particulars of the syntax):

  fn find_zero_one_indices<'a>(input: &'a [i32]) -> impl Iterator<Item = usize> + 'a {
      input.windows(2)
          .enumerate()
          .filter_map(move |(i, window)| {
              if window == &[0, 1] {
                  Some(i+1)
              } else {
                  None
              }
          })
  }
[Edit: This code fails to detect the leading group of `1`s in the input, so it does not return a result beginning with `0`. Corrected code might be:]

  fn find_zero_one_indices<'a>(input: &'a [i32]) -> impl Iterator<Item = usize> + 'a {
      let mut previous = 0;
      input.iter().enumerate().filter_map(move |(i, &x)| {
          let result = if previous == 0 && x == 1 { Some(i) } else { None };
          previous = x;
          result
      })
  }
[Edit: The corrected form is arguably less readable than the original due to the need to detect a leading window that doesn't start with `0`. This version somewhat weakens my argument about the clarity of the alternative, but this is still readable. Perhaps someone else with more Rust experience can suggest a simplification.]
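
One possible simplification, offered as a sketch and not taken from the thread: pair each element with its predecessor, treating the element before the start as a virtual `0`, and keep the positions where a `0` is followed by a `1`. The name `group_starts_zip` is made up for illustration.

```rust
/// Returns the index of every `1` that begins a group of `1`s.
/// Hypothetical helper; assumes the input holds only 0s and 1s.
fn group_starts_zip(input: &[i32]) -> Vec<usize> {
    // `previous` yields a virtual leading 0, then input[0], input[1], ...
    let previous = std::iter::once(&0).chain(input.iter());
    previous
        .zip(input.iter()) // (previous element, current element)
        .enumerate()       // index of the current element
        .filter(|&(_, (&prev, &cur))| prev == 0 && cur == 1)
        .map(|(i, _)| i)
        .collect()
}
```

On the input used elsewhere in this thread (`1 1 1 1 0 1 1 1 0 1 0 1 1 0 0 0 1 1 1 1`) this should give `[0, 5, 9, 11, 16]`, without copying the input or keeping mutable state in the closure.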

I find this readable since "windows" here is a direct reference to the concept of windowing, which is collecting consecutive items from a sequence into overlapping groups. "Enumerate" pairs each item of a sequence with its index, here `(i, window)`. These concepts exist in a number of languages.
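
To make the two primitives concrete, here is a small sketch of what `windows(2)` followed by `enumerate()` produces. The helper name `indexed_windows` is invented for this example; it only collects the pairs so they can be inspected.

```rust
/// Collects each overlapping 2-element window together with the index
/// of the window's first element. Illustrative helper only.
fn indexed_windows(input: &[i32]) -> Vec<(usize, Vec<i32>)> {
    input
        .windows(2)                   // overlapping slices of length 2
        .enumerate()                  // attach 0, 1, 2, ... to each window
        .map(|(i, w)| (i, w.to_vec()))
        .collect()
}
```

For `[1, 0, 1]` this yields `(0, [1, 0])` and `(1, [0, 1])`: the index is that of the window's first element, which is why returning the position of the `1` needs `i + 1`.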

But if you didn't know what these code elements meant, you could search for them: "rust windows", "rust enumerate", "rust filter_map" and get useful results. The stdlib documentation will be the first or second link. For example, if you don't know what "filter_map" does: https://doc.rust-lang.org/std/iter/trait.Iterator.html#metho... (And the same would be true for Java, Python, etc.)

Each of these elements (windows, enumerate, filter) is syntactically joined into a conceptual pipeline, providing a framework for reasoning about how they work, in a paradigm used by many languages.

Finally we have the detection logic: `if window == &[0, 1]`. Hopefully that's readable, and I expect makes this solution easier to understand than any of the other variants.

I may have somewhat missed the point of the article: it goes on to describe how the solution can be generalized in various ways. But this is also likely true of the canonical solutions in many languages: iterators (in particular), anonymous functions, etc. enable code snippets to compose and generalize in many ways as well.

1 comments

Why has this thing of posting unchecked GPT4 output become popular? It's infuriating. The explanation is wrong, but who cares right?

Oh great I see now they've added yet another APL-on-HN trope: the 'equivalent' (note: it's wrong too!) code in some trendy language (and saying APL/K are unreadable). It's like someone crafted the perfect message to piss me off.

I checked it insofar as I understand APL and what the algorithm was attempting to do, namely to detect and return the index of the first element of every consecutive "[0 1]" pair.

[Edit: OK, I see the mistake is that it should return the index of the `1`, not the `0`. Yes, that's the kind of mistake I'd catch during unit testing, or after more carefully reviewing the code.]

In what way is its explanation wrong? Its summary of the algorithm is:

> So, this code is finding the indices in the array ⍵ where a 0 is immediately followed by a 1, and replaces these 0s with the first element of ⍵. It then returns the indices of the non-zero elements in the result.

That summary seemed to correctly describe the algorithm, and seemed consistent with the article, so I assumed that its breakdown was correct. I don't understand the individual symbols or operators of APL and can't quickly fact-check the answer, but given that the summary was consistent with the article, it seemed to indicate that GPT had deduced (with no context from the article) what the algorithm was doing in human terms – in other words, whether it was able to "read" the APL code.

Whether it's actually mutating the input or not, I don't know. Its description of the transformation seemed to check out, though.

Regardless of whether its breakdown is right or wrong, my true intended commentary is about the opacity of the syntax and the difficulty in reading it (which I elaborated on via an edit, after your comment was posted but before seeing it).

> can't quickly fact-check the answer

If you would like to, visit https://tryapl.org/ and enter:

      ⎕IO ← 0
      bools ← 1 1 1 1 0 1 1 1 0 1 0 1 1 0 0 0 1 1 1 1
      {⍸(⊃⍵)@0⊣¯1⌽0 1⍷⍵} bools
The website will only (I think) let you copy-paste line by line. Alternatively, it has the symbols in a bar along the top that you can click on to enter them, and it has a backtick prefix system for typing them, so `L enters the quad square on the first line and `[ enters the left arrow. Left arrow is variable assignment, and a variable beginning with a quad is a system variable. The top line sets array indices to count from zero instead of the default of counting from one. The curly braces {} make the code into a scriptblock/anonymous function, which is immediately executed on the argument "bools".

If you want to see it build up stage by stage: it works from the right, and these give the intermediate steps:

    {0 1⍷⍵} bools

    {¯1⌽0 1⍷⍵} bools

    {(⊃⍵)@0⊣¯1⌽0 1⍷⍵} bools

    {⍸(⊃⍵)@0⊣¯1⌽0 1⍷⍵} bools
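
For readers without an APL session to hand, the same four stages can be mirrored step by step in Rust. This is a rough sketch under the assumption of zero-based indexing and a vector of 0s and 1s; `find_mask` and `group_starts_staged` are names made up here, and the comments give the APL stage each line is meant to imitate.

```rust
/// Mask with a 1 wherever `pattern` starts in `data` (rough analogue of `⍷`).
fn find_mask(pattern: &[u8], data: &[u8]) -> Vec<u8> {
    let mut mask = vec![0u8; data.len()];
    if pattern.is_empty() || pattern.len() > data.len() {
        return mask;
    }
    for (i, w) in data.windows(pattern.len()).enumerate() {
        if w == pattern {
            mask[i] = 1;
        }
    }
    mask
}

/// Imitates {⍸(⊃⍵)@0⊣¯1⌽0 1⍷⍵} one stage at a time, right to left.
fn group_starts_staged(bools: &[u8]) -> Vec<usize> {
    if bools.is_empty() {
        return Vec::new();
    }
    let mut mask = find_mask(&[0, 1], bools); // 0 1⍷⍵ : where "0 1" starts
    mask.rotate_right(1);                     // ¯1⌽   : shift marks onto the 1s
    mask[0] = bools[0];                       // (⊃⍵)@0: element at index 0 := first of ⍵
    mask.iter()                               // ⍸     : indices of non-zero entries
        .enumerate()
        .filter(|&(_, &m)| m != 0)
        .map(|(i, _)| i)
        .collect()
}
```

Calling `group_starts_staged` on the `bools` vector above should return `[0, 5, 9, 11, 16]`.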
> "Regardless of whether its breakdown is right or wrong, my true intended commentary is about the opacity of the syntax and the difficulty in reading it (which I elaborated on via an edit, after your comment was posted but before seeing it)."

Which is a bit of a shame because the point of the article is at the end: "I do not believe that you can leverage this kind of suggestivity with languages that are more verbose. [...] It is fundamentally easier to try out many single line expressions than it is to try out many 15 - 30 line expressions. [...] in the same way that any technique that reduces the length of the feedback cycle improves our ability to iterate and therefore learn.".

It's not saying that APL is easy to read, it's claiming that learning to read it/think in it has benefits that don't apply to longer languages - more forest less trees, more design less bricklaying. Taking that away into "I never learned it and it looks hard" is not going with the more interesting part of the claim - like, would you rewrite your Rust code to detect the end of groups, then compare both versions and see they are annoyingly clunky, then search for a version which is more pleasing and suggests variations which may do other useful or interesting things with the groups? Even if you did, doing so may be harder because the Rust code is so much longer (more effort to compare in your head) and you would be less able to because the Rust code took more time to write so you have less time left over.

[This is rather more for the world of doing a mathematical puzzle, seeing two equations look similar and realising that the puzzles share some common underlying thing which you hadn't noticed before, than the world of writing production webservers with fearless concurrency].

<mean message edited out - despite me really wanting to be mean to someone who's been programming for longer than I've been alive but seems to lack reading comprehension>

You don't have to understand APL to know what it's attempting. From the article it clearly is meant to return [0,5,9,11,16], your rust returns [4, 8, 10, 15] (at least the second one, the first one doesn't even compile!).

> From the article it clearly is meant to return [0,5,9,11,16], your rust returns [4, 8, 10, 15].

You're right, I didn't test the code, nor review it especially closely.

I took a quick glance for a couple of seconds, checked that it compiled, glanced at the primitives it was using, and said "LGTM". I wanted to know "What general shape will this algorithm take expressed a different way?"

You shouldn't expect perfection from someone of any skill level posting or discussing topics casually like this. The amount of time and effort I'm willing to spend on comments is limited – limited enough to allow an off-by-one error to slip through. This is commentary in an Internet forum, not a code review for a production system, or a formal publication.

However, I think the nature of the error and its fix actually reinforces my point. If one did review the Rust code closely (which I didn't – I was more interested in GPT's capability to explain the APL syntax; and what the algorithm would look like expressed in another language), then a mistake would have stood out:

  .filter_map(move |(i, window)| {
    if window == &[0, 1] {
      Some(i)
If we want to return the index of the `1`, not the `0`, then `i` is the wrong index to return. I assume that's the mistake you're pointing out.

Yes, it's true that I didn't even realize that the original algorithm was returning the second index of these elements and not the first. I barely skimmed those details before becoming entirely distracted by APL's syntax itself, and then started to wonder about the various other ways that such an algorithm might be implemented in another language (and how clearly, subjectively). "The APL code is opaque. Will alternatives be opaque?"

Yes, a mistake like this would be trivially caught upon a close review of the algorithm or code, or on any unit testing.

However, I think it bolsters my point that the correction to the Rust code is easy to understand – both the mistake and the nature of the fix:

  .filter_map(move |(i, window)| {
    if window == &[0, 1] {
      Some(i + 1)
(The only thing that's not immediately obvious is why `i+1` is guaranteed to be a valid index – which it is.)
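
A short sketch of why `i + 1` stays in bounds: `windows(2)` on a slice of length `n` yields `n - 1` windows (none when `n < 2`), so `i` runs from `0` to `n - 2` and `i + 1` never exceeds `n - 1`. The helper `second_indices` is hypothetical and exists only to show the range.

```rust
/// Index of the second element of every 2-element window.
/// Illustrative helper; every returned index is a valid index into `input`.
fn second_indices(input: &[i32]) -> Vec<usize> {
    input
        .windows(2)          // n - 1 windows for a slice of length n >= 2
        .enumerate()         // i = 0 ..= n - 2
        .map(|(i, _)| i + 1) // so i + 1 = 1 ..= n - 1
        .collect()
}
```

For a five-element slice this gives `[1, 2, 3, 4]`, and for an empty or single-element slice it gives nothing at all.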

[Edit: Now that I've had time to actually run and test the code, a second error is that it omits the first group of `1`s. A corrected version is:

  fn find_zero_one_indices<I>(input: I) -> Vec<usize>
  where
      I: IntoIterator<Item = i32>,
  {
      let input: Vec<_> = std::iter::once(0).chain(input.into_iter()).collect();
      
      input.windows(2)
          .enumerate()
          .filter_map(|(i, window)| {
              if window == &[0, 1] {
                  Some(i)
              } else {
                  None
              }
          })
          .collect()
  }
This returns `Some(i)` again rather than `Some(i+1)` because we've prepended a `0` to the input.]

My comment is not about Rust specifically. It's just the language I chose. I assume the translation would be roughly similar in Java (with iterators), Python, Ruby, and the imperative version in C and C++ and similar.

And I believe in most of these languages, the mistake and its correction would both be easy to see and understand with close enough review. I'm not certain that would be true for the APL code. If you gave me that algorithm in any mainstream language, and asked me to look for mistakes, I could probably spot that problem upon reviewing it, and probably know how to fix it. (Probably, not necessarily.)

I do not believe that I could spot an analogous problem in the APL code nor understand easily how to fix it. My overall point is that the APL code seems entirely unapproachable to the un-initiated – even for a programmer experienced with numerous other languages.

(I might have similar difficulty with Haskell, but probably not as much; and perhaps Scala if it was written in a particularly obscure way.)

----

An additional topic that I would be interested to explore is: what kind of errors would be likely in the APL code? What would the APL code look like that had a similar off-by-one error? Would it be easy to spot by reading the code? And what would the "diff" look like correcting it? I unfortunately don't know enough about APL to explore this.

----

[In response to several edits of your comment that contained ad hominem attacks, and a lack of explanation of your criticism:]

Please follow the Hacker News guidelines:

https://news.ycombinator.com/newsguidelines.html

> Be kind. Don't be snarky. Converse curiously; don't cross-examine. Edit out swipes.

> Comments should get more thoughtful and substantive, not less, as a topic gets more divisive.

> When disagreeing, please reply to the argument instead of calling names. "That is idiotic; 1 + 1 is 2, not 3" can be shortened to "1 + 1 is 2, not 3."

> “What would the APL code look like that had a similar off-by-one error? Would it be easy to spot by reading the code?”

It kinda does have an off-by-one error: (Dyalog) APL defaults to indexing from one, and the code assumes indexing from zero. The blog author is a well-known APL-er, and other “modern APL” users often assume indexing from zero. The fix is the quad-IO setting (⎕IO ← 0) in my other comment, so it’s not easily visible in the code because it’s not in the code at all, but it is an easy fix.

At least, that’s one kind of off-by-one error; APL doing whole array transforms in each operation tends not to have as much room for off-by-one errors when you aren’t counting through things in the typical imperative way.