| So, admitted jq fanboy here, but I found a lot of the criticism in the article really sensible. I think jq has a pretty elegant data model, but the syntax is often very clunky to work with. So here is a half-thought-out idea for how you might improve the syntax for the "stateful operations" use case the OP outlined: I think it's not quite true that different elements of a sequence can never interact. The OP mentioned reduce/foreach, but it's also what any function that takes arguments does: If you have an expression 'foo | bar', then bar is called once for every element foo emits. However, foo could also be a function that takes arguments. Then you can specify bar as an argument of foo like this: 'foo(bar)'. In this situation, execution of bar is completely controlled by foo. In particular, foo gets to see all elements that bar emits, not just one each. I believe this is how e.g. [x] can collect all elements of x into an array. In the same way, you could write a function 'add_all(x)' which calls x and adds up all emitted elements to a sum. However, this wouldn't help you with collecting all input lines, as there is nothing for your function to "wrap around". Or at least, there used to be nothing, but I think in one of the recent builds, an "inputs" function was added, which emits all remaining inputs. So now, you can write e.g. '[., inputs]' to reimplement slurp. In the same way, you could sum up all input lines by writing 'add_all(., inputs)'. However, this is still ugly and unintuitive to write, so I think introducing some syntactic sugar for this would be useful. E.g., you could imagine a "collect operator", say '>>', which treats everything to the left of it as the first argument to the function to the right of it. So writing 'a >> b' would desugar to 'b(a)'. Writing 'a | b >> c' would desugar to 'c(a | b)'. Any steps further to the right are not affected: 'a | b >> c | d' would desugar to 'c(a | b) | d'. 
Scope to the left could be controlled with parentheses: 'a | (b >> c)' would desugar to 'a | c(b)'. To make this more useful for aggregating over input lines, you could add a special rule that, if the operator is used with no parentheses, it will implicitly prepend '(., inputs)' as the first step. So if the entire top-level expression is 'a | b >> c', it would desugar to 'c((., inputs) | a | b)'. This would make many use cases that require keeping state much more straightforward. E.g. collecting all the "baz" fields into an array could be written as '.baz >> []', which would desugar to '[(., inputs) | .baz]'. Summing up all the bazzes could be written as '.baz >> add_all', which would desugar to 'add_all((., inputs) | .baz)' ...and so on. On the other hand, this could also lead to new confusion, as you could also write stuff like '... | (.baz >> map) | ...', which would really mean 'map(.baz)', or 'foo >> bar >> baz', which would desugar to the extremely cryptic expression 'baz((., inputs) | bar((., inputs) | foo))'. So I'm not quite sure. Any thoughts about the idea? |
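Since '>>' is only a proposal and can't be run in jq itself, here is a rough Python sketch of the semantics described above, with each jq filter modelled as a function from one input value to an iterator of outputs. Everything in it is an assumption made for illustration: the helper names (identity, field, pipe, collect, dot_and_inputs, collect_op) are made up, add_all is the hypothetical function from the comment rather than a jq builtin, and the model ignores most of what real jq does.

```python
# Model: a jq filter is a function from one input value to an iterator of outputs.

def identity(value):
    """Model of '.'."""
    yield value

def field(name):
    """Model of '.name'."""
    def run(value):
        yield value[name]
    return run

def pipe(f, g):
    """Model of 'f | g': g runs once per value f emits and sees only that value."""
    def run(value):
        for out in f(value):
            yield from g(out)
    return run

def add_all(f):
    """Model of the hypothetical 'add_all(f)': it drives f itself, so it sees every output."""
    def run(value):
        yield sum(f(value))
    return run

def collect(f):
    """Model of '[f]': gather every output of f into one list."""
    def run(value):
        yield list(f(value))
    return run

def dot_and_inputs(rest):
    """Model of '(., inputs)': the current value, then all remaining inputs."""
    def run(value):
        yield value
        yield from rest
    return run

def collect_op(left, func, rest):
    """Top-level 'left >> func', desugared to 'func((., inputs) | left)'."""
    return func(pipe(dot_and_inputs(rest), left))

docs = [{"baz": 1}, {"baz": 2}, {"baz": 3}]

# '.baz | add_all(.)' evaluated once per input: each run only ever sees one value.
per_input = [list(pipe(field("baz"), add_all(identity))(d)) for d in docs]

# '.baz >> add_all'  ->  'add_all((., inputs) | .baz)'
total = list(collect_op(field("baz"), add_all, iter(docs[1:]))(docs[0]))

# '.baz >> []'  ->  '[(., inputs) | .baz]'
array = list(collect_op(field("baz"), collect, iter(docs[1:]))(docs[0]))

print(per_input, total, array)
```

The per-input pipeline yields three separate one-element results, while the two desugared forms each yield a single aggregate, because the wrapping function is the one pulling values out of its argument.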
The pipe operator that's in its final stages of approval for JavaScript uses '|>' as its sigil, which is a decent compromise between not conflicting with existing operators, being compatible with developers' existing pattern matching, and representing what it does somewhat. And 'a | b |> c' is ok.
You could just have '@c a | b' mean "everything from @c onwards to the end of the string is the argument to c", i.e. 'c(a | b)', and have 'c a | b' be 'c(a) | b'. Then anything more complicated just requires using the parentheses operator to enclose an expression, i.e. 'c (a | b)', or just 'c(a | b)' if your tokenizer is a bit cleverer :) Actually I like this idea, because '@' is syntactic sugar for () around the rest of the query, and a function then operates on the value of the expression following it.
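To make the '@' rule concrete, here is a minimal Python sketch of the rewrite as a plain string transformation. The function name desugar_at is hypothetical, and the sketch assumes the query contains no string literals and no other use of '@'. It does not deal with the fact that jq already uses '@name' for format strings such as '@csv', nor with the bare 'c a | b' form.

```python
import re

def desugar_at(query: str) -> str:
    """Rewrite '@name rest-of-query' into 'name(rest-of-query)'.

    Everything after the first '@name' up to the end of the string becomes
    the argument, so later '@' markers end up nested inside it.
    """
    match = re.search(r"@(\w+)\s+", query)
    if match is None:
        return query  # nothing to rewrite
    head = query[:match.start()]        # stages before the '@', left untouched
    body = query[match.end():].strip()  # the rest of the query becomes the argument
    return f"{head}{match.group(1)}({desugar_at(body)})"

for q in ["@c a | b", "x | @c a | b", "@c a | @d b", "c(a) | b"]:
    print(q, "->", desugar_at(q))
```

Because the argument always runs to the end of the string, there is no way to close it early except with explicit parentheses, which matches the "anything more complicated just requires parentheses" part of the suggestion.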