by pizza 1648 days ago
Amazing.. 100 gratitudes for you from me for taking the time out to explain all of that. Very impressed by this and all the work that’s been done.

Oh, and for un-padding, I meant: how do I do the inverse of `fill_none . pad_none`?

Also saw there was some stuff about algebraic types (eg semigroup reductions) - is that kind of algorithm-level type annotation a direction you all are interested in exploring further?

1 comments

Un-padding: something like string-trimming (e.g. `str.rstrip`), but for missing values at the ends of lists... There isn't a function for that.

If you happen to know that the only uses of missing values are at the ends of lists, `ak.is_none` and `ak.sum` (with the appropriate `axis`) can count them, and you could perhaps construct a slice from that (negative to count from the end, and therefore slice off the missing values only). I'd have to think about it, but that would be the beginning of a columnar implementation of "unpad_none".
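To make the counting idea concrete, here is a minimal row-wise sketch in plain Python. It is not Awkward code and not columnar; it only shows the logic, assuming missing values occur solely at the ends of lists. The name `unpad_none` is hypothetical. One detail worth noting: a negative slice such as `row[:-n]` returns an empty list when `n` is 0, so the sketch computes an explicit stop index instead.

```python
def unpad_none(lists):
    """Strip trailing None values from each inner list.

    Assumes None appears only at the end of each list, so counting
    the None values tells us how many items to slice off.
    """
    result = []
    for row in lists:
        # Count the missing values in this row (the role of is_none + sum).
        n_missing = sum(1 for x in row if x is None)
        # Explicit stop index; row[:-n_missing] would be wrong for n_missing == 0.
        stop = len(row) - n_missing
        result.append(row[:stop])
    return result


padded = [[1, 2, None, None], [3, None, None, None], [4, 5, 6, None], [7, 8, 9, 10]]
unpadded = unpad_none(padded)
print(unpadded)
```

A columnar version would compute all the counts and stop indices at once rather than looping over rows.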

As for the algebraic types, I was using the terminology to explain what the reducers do. Some operations, like sum and product, have identities, and some don't, like argmin.
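A small plain-Python illustration of that distinction, as a sketch rather than a description of how any library implements it: reducing an empty list with sum or product can return the identity, while argmin has nothing sensible to return, so `None` stands in for a missing value here. The helper names are hypothetical.

```python
from functools import reduce
import operator


def reduce_with_identity(op, identity, values):
    """Fold values with op, starting from the identity element."""
    return reduce(op, values, identity)


def argmin_or_none(values):
    """Index of the smallest value, or None when there are no values."""
    if not values:
        return None  # no identity exists for argmin
    return min(range(len(values)), key=values.__getitem__)


empty = []
sum_empty = reduce_with_identity(operator.add, 0, empty)   # identity of +
prod_empty = reduce_with_identity(operator.mul, 1, empty)  # identity of *
argmin_empty = argmin_or_none(empty)                       # no answer
print(sum_empty, prod_empty, argmin_empty)
```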

As for type annotations, I don't know what you mean. We're not using Python type annotations, but they'd be too coarse to describe what these operations do. Awkward-specialized type annotations might be overkill. For Dask, which needs to be able to predict types, we're passing tracer objects through the codebase to observe the types change without actually computing values, so it's a type-propagation by execution.

Ah, interesting. With the algebraic stuff I meant that if you have a semigroup or a monoid homomorphism, it translates nicely into a parallel distributed computation problem, hence the semigroup flag works nicely with the reduction ops.
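A rough sketch of why associativity plus an identity makes this work: each chunk can be reduced independently (in principle on a different worker), and the partial results are then combined with the same operation. The chunks run sequentially here, and `chunked` and `chunked_reduce` are made-up names for illustration.

```python
from functools import reduce
import operator


def chunked(values, size):
    """Split values into consecutive chunks of at most `size` items."""
    return [values[i:i + size] for i in range(0, len(values), size)]


def chunked_reduce(op, identity, values, size):
    """Reduce each chunk separately, then combine the partial results.

    Correct only when op is associative and identity is its identity element.
    """
    partials = [reduce(op, chunk, identity) for chunk in chunked(values, size)]
    return reduce(op, partials, identity)


data = list(range(1, 11))
total = chunked_reduce(operator.add, 0, data, 3)
product = chunked_reduce(operator.mul, 1, data, 4)
print(total, product)
```

A non-associative operation such as subtraction would generally give a different answer depending on the chunk size.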

So I was wondering how I could exploit Awkward’s typing system to use/implement some goodies from Haskell a la https://wiki.haskell.org/Typeclassopedia

Like, for instance, what if I could make an array of heterogeneous ufuncs and apply that to a similarly shaped array (like an Applicative)? For example, I might want to implement graph rewriting by applying a rules ufunc array to an adjacency array, or even, to get very meta, apply a rules function array to another rules function array.

Or if I wanted to compute eg the fixed point of a series of those applications, etc.
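As a toy sketch of those two ideas in plain Python (lists of ordinary functions, not real ufuncs or Awkward arrays): one function per cell is applied zip-style, and the application is repeated until the values stop changing. The names `apply_cellwise` and `fixed_point` are hypothetical.

```python
def apply_cellwise(funcs, values):
    """Zip-style applicative apply: funcs[i] is applied to values[i]."""
    if len(funcs) != len(values):
        raise ValueError("funcs and values must have the same length")
    return [f(x) for f, x in zip(funcs, values)]


def fixed_point(funcs, values, max_steps=100):
    """Re-apply the per-cell functions until no cell changes."""
    for _ in range(max_steps):
        new_values = apply_cellwise(funcs, values)
        if new_values == values:
            return new_values
        values = new_values
    raise RuntimeError("no fixed point reached within max_steps")


# One rule per cell: halve, count up to a cap of 5, leave unchanged.
rules = [lambda x: x // 2, lambda x: min(x + 1, 5), lambda x: x]
start = [8, 0, 3]
result = fixed_point(rules, start)
print(result)
```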

Or maybe if I wanted to use Arrow types to abstractly represent computations within each cell, do some fancy stuff in each cell, and perform some rudimentary ‘compiler optimization’ by inspecting which cells would end up doing unnecessary work (in the context of whatever problem I am doing; eg suppose I only permitted 3 chained ufunc calls per cell or something weird like that), that would be really cool too.

Or eg if for some unknown reason I wanted each cell to fire off 2 concurrent ufuncs within each cell, and I only was interested in the result that ‘won’ the data race for each cell, I could use eg an Alternative in the style of the Concurrently library.

Or if I wanted eg each cell to be like a MonadPlus; do some work in the cell but also provide builtin “recovery” capabilities per cell if the cell evaluated to empty/missing/None
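A plain-Python sketch of that per-cell recovery idea, with `None` playing the role of empty/missing: each cell tries a primary computation and falls back to an alternative only if the primary yields `None`. This is similar in spirit to filling missing values after the fact, except that the fallback is itself a computation. All names here are hypothetical.

```python
def with_recovery(primary, fallback):
    """Build a per-cell function that falls back when primary yields None."""
    def run(x):
        result = primary(x)
        return fallback(x) if result is None else result
    return run


def safe_reciprocal(x):
    """Return 1/x, or None (missing) when x is zero."""
    return None if x == 0 else 1 / x


# Recover from a missing result by substituting 0.0 for that cell.
cell = with_recovery(safe_reciprocal, lambda x: 0.0)
values = [2, 0, 4]
recovered = [cell(x) for x in values]
print(recovered)
```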

Ah now another interesting possibility could be a matrix of lambda calculus statements..!

Musings and sketches.. :)

Very very cool work indeed!