Hacker News
by fhsm 1339 days ago
It sounds like we do similar sorts of things (based on the tool list), but each time I poke at an APL-like system I back away clear on the learning curve and unclear on the value.

At core, my job is arithmetic on 3D arrays of approx 10x1000x100,000,000.

The rub is that for every LOC written manipulating those structures I’ve got 100 LOC doing IO (broadly defined) and then 1,000-10,000 doing some form of ETL, QC, or normalization (i.e. finding and validating the correctness of the magic numbers that go in the cells of the big array).

Do you think J/APL would be of any use to me and if so where in your similar projects’ life cycle does it crop up?

1 comment

Possibly so. We do simple IO primarily from parquet or csv data, and I am currently working on the Apache Arrow/Flight package. Our data sets typically fit in RAM, but if you have either enough RAM or a file amenable to memory mapping, you shouldn’t have a problem. J has a memory mapping utility. Numerical data is straightforward. We do ETL, QC, and normalization in J. Avoid transposes where you can, but reshapes are trivial and done with metadata, so they are fast. Overall, what you describe sounds like a fun project to try in J.
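To make the memory-mapping and "reshape is just metadata" points concrete, here is a rough sketch in Python's standard library. It is an analogy only, not J code and not how J does it internally. The 8-byte element size, the file name `cube.bin`, and the tiny 2x3x4 stand-in shape are all assumptions made for illustration.

```python
import mmap
import os
import struct
import tempfile

# Footprint of the array from the question, assuming 8-byte floats.
# The element type is an assumption; the thread does not state it.
full_shape = (10, 1000, 100_000_000)
cells = full_shape[0] * full_shape[1] * full_shape[2]
bytes_needed = cells * 8

# Tiny stand-in: 24 doubles written to a temporary file in row-major order.
count = 2 * 3 * 4
path = os.path.join(tempfile.mkdtemp(), "cube.bin")
with open(path, "wb") as f:
    f.write(struct.pack(f"{count}d", *map(float, range(count))))

with open(path, "r+b") as f:
    mm = mmap.mmap(f.fileno(), 0)  # length 0 maps the whole file

# Two views over the same mapped bytes; only the shape metadata differs.
cube = memoryview(mm).cast("d", shape=[2, 3, 4])
wide = memoryview(mm).cast("d", shape=[6, 4])

last_in_cube = cube[1, 2, 3]
last_in_wide = wide[5, 3]

# A write through one view is visible through the other: no copy was made.
cube[0, 0, 0] = 99.0
shared_value = wide[0, 0]

# Release the views before closing the mapping.
cube.release()
wide.release()
mm.close()
os.remove(path)
```

Under the 8-byte assumption the full 10x1000x100,000,000 array comes to about 8 TB, which is why the file being amenable to mapping matters more than RAM here. A transpose, unlike the reshape above, changes the order in which elements are visited, so it generally means either a copy or strided access.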