Hacker News
by xelxebar 32 days ago
> At some point you get what APL is all about, and you can move on with life without too many regrets.

Unfortunately, this seems to be a common experience. A lot of smart people only engage with APL via toy puzzles, like you did, and bounce off because that gives no insight about how to use the language in real life. IME, to really start getting APL you need to write and rewrite a full application 20 times.

It helps to read code from the masters, too [0, 1, 2, 3, 4]. These all approach architecture in different ways: pedagogical FP style, OOP heavy, data-oriented design, event-driven state-machine, or a mix of the above.

[0]:https://dfns.dyalog.com/

[1]:https://github.com/Co-dfns/MicroUI-APL

[2]:https://github.com/Dyalog/ewc

[3]:https://github.com/Co-dfns/Co-dfns

[4]:https://github.com/Dyalog/Jarvis/blob/master/Source/Jarvis.d...

> As you can see, the famous prime generator is not even the Eratostenes' sieve, but a simple N^2 divisor counting computation.

Well, that's because you wrote a divisor function, not a sieve. Arguably, the ease of typing an outer product (i.e. ∘.|⍨⍳N) can tempt us into writing quadratic algorithms unnecessarily, but this is just an experience issue, IMO.
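For readers who don't read APL, here is a rough Python sketch of what the divisor-counting approach amounts to. It is an illustration under assumptions, not a translation of any particular one-liner: the function name and the explicit table are hypothetical, chosen to make the N×N cost visible.

```python
def primes_by_divisor_count(n_max):
    """Keep the numbers in 1..n_max that have exactly two divisors."""
    numbers = range(1, n_max + 1)
    # Full n_max x n_max divisibility table, the analogue of the outer product.
    # Building it is what makes the method quadratic in n_max.
    divides = [[m % d == 0 for m in numbers] for d in numbers]
    # Column sums: how many d in 1..n_max divide each m.
    counts = [sum(row[j] for row in divides) for j in range(n_max)]
    return [m for m, c in zip(numbers, counts) if c == 2]


print(primes_by_divisor_count(20))
```

Every number is tested against every possible divisor, so both time and memory grow with N^2 no matter how few primes there are.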

If we want a sieve, we can just write one directly:

    p⊣{⍵~n×1+⍳⌊N÷p⍪←n←⍵↑⍨1⌊≢⍵}⍣≡1↓1+⍳N⊣p←⍬

The algorithm is O(N log log N) as expected of a naive Eratosthenes implementation. You'll need ⎕IO←0 if you want to try it out.

There's also a faster sieve by Roger Hui [0] in the dfns workspace as well as a family of prime number functions [1] for things more than just prime generation.

[0]:https://dfns.dyalog.com/n_sieve.htm

[1]:https://dfns.dyalog.com/n_pco.htm

2 comments

That's not O(N log log N), it's more like N^2. Prime sieves are hard to implement well with immutable arrays for obvious reasons; there are some cool methods but they're definitely harder. I'm ashamed to be part of a community that won't cop to this.

The algorithm iterates over numbers ⍺ from 2 to N, removing the multiples that are greater than ⍺ and no greater than N from p. If the removal with ~ has to inspect all of p, then all the primes are there so we have asymptotically at least N/log N entries by the prime number theorem and we get N^2/log N time (when ⍺ is over N/2, no multiples are in range so this can be skipped, but that just cuts a constant factor 2 from the time). Conceivably p could be marked sorted, so the entries to remove could be found with a binary search. This is a bit harder to analyze, but I think each prime under √N will cause the list to change, and incur N/log N data movement. So you get at least (N/log N)^(3/2) cost, still quite a bit worse than linear.

Edit: changed the algorithm while I was writing... the new one is better, it keeps one list p of primes and one list ⍵ of not-yet-marked-out numbers. However, primes are removed from ⍵ one at a time, so that each of the N/log N primes has to be moved for each one before it, giving at least (N/log N)^2 cost (I mean, maybe an interpreter could binary search and also recognize when ~ only drops the first entry and do that by slicing? But the (N/log N)^(3/2) from above definitely holds). Mutating a bit array in place is pretty important to classical sieve performance.
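The counting argument can be checked with a small Python sketch. This is an imperative rendering of the remove-one-prime-at-a-time idea as described here, not the APL itself. The function name and the `scanned` counter are hypothetical, and it assumes each removal pass has to look at every element still in the list.

```python
def sieve_by_removal(n_max):
    """Return (primes, scanned); scanned counts list elements examined."""
    remaining = list(range(2, n_max + 1))  # not-yet-marked-out numbers
    primes = []
    scanned = 0
    while remaining:
        n = remaining[0]                   # smallest survivor is prime
        primes.append(n)
        multiples = set(range(n, n_max + 1, n))
        scanned += len(remaining)          # one full pass over the list
        # Build a fresh list without n and its multiples (no mutation).
        remaining = [m for m in remaining if m not in multiples]
    return primes, scanned


primes, scanned = sieve_by_removal(1000)
print(len(primes), scanned)
```

Each later prime is still in the list during every earlier pass, so `scanned` is at least k(k+1)/2 for k primes found. That is the (N/log N)^2 behaviour, up to a constant.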

It's possible to get O(N^(3/2)/log N) in an ordinary interpreter with some changes to the code, assuming linear-time ~ with hashing. The idea is to leave the primes in ⍵ and stop when it stops changing, which will happen at the first prime over √N. That makes about √N/log N steps, one per prime up to √N, and each step costs O(N), which gives the stated bound. It's also a small change to get the multiples to remove to start at n×n instead of n.

    i←¯1⋄{⍵~n×n↓⍳1+⌊N÷n←i⊃⍵⊣i+←1}⍣≡1↓1+⍳N

I think this is about as good as can be done with ~ instead of marking out bits. And I wouldn't say it's as easy as the imperative version!
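As a cross-check on the idea (leave the primes in the list, start removing at n×n, stop once a pass changes nothing), here is a minimal Python sketch. It is not a line-by-line translation of the APL: the names are hypothetical, and a hashed `set` stands in for a linear-time ~.

```python
def sieve_keep_primes(n_max):
    """Return (primes, passes) using whole-list removal, never mutating."""
    candidates = list(range(2, n_max + 1))
    i = 0
    passes = 0
    while i < len(candidates):
        n = candidates[i]                  # i-th survivor; primes stay in the list
        passes += 1
        to_remove = set(range(n * n, n_max + 1, n))
        filtered = [m for m in candidates if m not in to_remove]
        if len(filtered) == len(candidates):
            break                          # nothing changed: n*n is past n_max
        candidates = filtered
        i += 1
    return candidates, passes


print(sieve_keep_primes(50))
```

For N = 50 the passes use n = 2, 3, 5, 7 and then stop at 11, the first prime over √50, so only five O(N) passes are needed.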

Eep. You're right. Evidently, I didn't even know what a sieve was in this context and wrote a search instead. You got me to do a bit of research. Actually, this discussion is exactly where I think APL shows one of its strengths. It feels like a human communication tool more than any other PL I've mucked about with. The hard parts here are not language issues but fundamental understanding ones.

It's a tad tricky to carefully analyze the asymptotics of my above prime generator, since the search space of Without (~) shrinks on each iteration. I think Mertens' theorem gives an estimate of e^-γ/log(p_i), which we need to sum for all primes up to N. Taking prime density 1/log(n) and integrating N/(log x)^2 over our range is O(N^2/log N), I think.
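Written out, the estimate looks roughly like this. It is a heuristic sketch, not a proof. It assumes the list scanned at the step for prime p holds about N e^{-γ}/log p entries, and that primes near x have density 1/log x.

```latex
\sum_{p \le N} \frac{N e^{-\gamma}}{\log p}
  \;\approx\; N e^{-\gamma} \int_{2}^{N} \frac{dx}{(\log x)^{2}}
  \;\sim\; e^{-\gamma}\, \frac{N^{2}}{(\log N)^{2}}
```

Under those assumptions the total is on the order of N^2/(log N)^2, which sits inside the O(N^2/log N) bound and has the same shape as the (N/log N)^2 figure from the reply above.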

> Mutating a bit array in place is pretty important to classical sieve performance.

Challenge accepted:

    p⊣{x[⍵+n×⍳⌊N÷n←p⍪←x[⍵]]←0 ⋄ 1+⍣{(x⍪1)[⍵+1]≠0}⊢⍵}⍣{⍵=N-2}0⊣x p←(1↓1+⍳N)⍬

We just directly set roughly N/p items to 0 on each iteration (proper sieve semantics), which should give O(N log log N), unless I'm missing something.
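The "roughly N/p writes per prime" claim is easy to sanity-check in an imperative sketch. The Python below follows the same outline (mark multiples in place, then step forward to the next unmarked entry), but it is not a literal translation of the APL. The `writes` counter and all names are hypothetical additions.

```python
def sieve_in_place(n_max):
    """Return (primes, writes); writes counts in-place marks on the array."""
    crossed = bytearray(n_max + 1)         # 0 = still a candidate, 1 = marked out
    primes = []
    writes = 0
    for n in range(2, n_max + 1):
        if crossed[n]:
            continue                       # already marked by a smaller prime
        primes.append(n)
        # Mark n and its multiples in place: about n_max / n writes.
        for m in range(n, n_max + 1, n):
            crossed[m] = 1
            writes += 1
    return primes, writes


primes, writes = sieve_in_place(30)
print(primes, writes)
```

The write count is exactly the sum of ⌊N/p⌋ over the primes found, which is where the N log log N growth comes from.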

> A lot of smart people only engage with APL via toy puzzles

I think part of this is because that is how most (possibly all) sources teach APL and array languages: by solving puzzles and manipulating arrays. If you learn to write programs in an Algol-derived language, you can write programs in most common languages without having to learn how to write programs again; you just need to learn the language. Modern array languages sort of allow us to use them like the Algol-derived languages, but this does not seem to work out so well and often does not play to the strengths of array languages.